
'||''''|..|''||   .|'''.|  .|'''.|        '||                      
 ||  . .|'    ||  ||..  '  ||..  '   ...   ||  ...   ... ..... ... 
 ||''| ||      ||  ''|||.   ''|||. .|  '|. ||.|  '|.|| ||  '|.  |  
 ||    '|.     ||.     '||.     '||||   || ||||   || |''    '|.|   
.||.    ''|...|' |'....|' |'....|'  '|..|'.||.'|..|''||||.   '|    
                                                   .|....'.. |     
                                                           ''     
Copyright (C) 2008 Hewlett-Packard Development Company, L.P.

About
=====
FOSSology is a framework for software analysis, both source and binary.
It uses a repository for unpacking and storing the uploads, "agents" 
to analyze the uploaded files, and a Postgres database to store and display
the results. Also included is a license agent for scanning source code for
potential license texts.

Introduction
============
This document is designed to help you get FOSSology installed and ready
to use. Its intended audience is the system administrator who wants to
quickly get a local install up and running, or a distribution developer 
looking to create packages.

For extended discussion on how to use or tune the software, please see
the User Documentation, available on the website, in the doc/ directory
of the source, or the doc/ directory on an installed system (directory
may vary depending on where you install it).

Preparing your system
=====================
*** Disk space ***
FOSSology stores uploaded data in a filesystem repository.  As you
upload and analyze packages via FOSSology, the repository can grow very
large.  The default location for a single system repository is
/srv/fossology/repository/; however, the system administrator can
move it to another location if desired.

It is recommended that the area you choose for the repository
be a separate mount point with at least 4x the size of the unpacked
data you intend to scan.  For a small system intended to just scan a
few small personal projects this might mean gigabytes, but for systems
intended for scanning large collections of software including Linux
distributions, this probably means hundreds of gigabytes to terabytes.
If you are using multiple hosts to store the repository, it is best to
spread the repository data evenly across the hosts.  See the User Guide
for more information about using multiple hosts.
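
As a rough sizing aid, the 4x rule above can be sketched in shell. The
helper name repo_size_needed is hypothetical (it is not part of
FOSSology), and sizes are assumed to be whole gigabytes:

```shell
# Sketch only: recommended repository size under the 4x rule of thumb.
# repo_size_needed is a hypothetical helper; $1 is unpacked data in GB.
repo_size_needed() {
    unpacked_gb=$1
    echo $(( unpacked_gb * 4 ))
}

# Example: 25 GB of unpacked data to scan
echo "Plan for at least $(repo_size_needed 25) GB of repository space"
```

Compare the result with the free space that "df -h" reports for the
mount point you intend to use.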

*** Dependencies ***
FOSSology relies on a number of existing tools and libraries and
expects to find them on the system. It also depends on certain system
users existing and on a few adjustments to the system configuration.

FOSSology depends on recent versions of the following software:

Postgresql:
 8.1 or newer
 On Debian (or a Debian derived distribution) you should be able to use 
 something like:
  apt-get install postgresql-8.1 postgresql-server-dev-8.1

Apache2:
 FOSSology was developed using the Apache2 package included with Debian
 GNU/Linux 4.0 (codename "Etch").  (Most Linux distros include Apache.)
 On Debian (or a Debian derived distribution) you should be able to use 
 something like:
   apt-get install apache2

PHP5:
 On Debian (or a Debian derived distribution) you should be able to use 
 something like:
  apt-get install php5 php5-pgsql php-pear libapache2-mod-php5

Libraries (see below to install with apt-get):
* libmagic - for determining file types, from the "file" software
   ftp://ftp.astron.com/pub/file/file-4.02.tar.gz 
* libxml2 - GNOME XML library
   http://gnome.org
* libextractor - GNU file meta-data extractor
   http://www.gnunet.org/libextractor/

 On Debian (or a Debian derived distribution) you should be able to use
 something like:
  apt-get install libmagic-dev libxml2-dev libextractor-dev \
	libextractor-plugins

External Commands:
* ar - for extracting archives, from the binutils software
   http://www.gnu.org/software/binutils/
* bzcat - bz2 decompressor, from the bzip2 software
   http://www.bzip.org/
* cabextract - extractor for Microsoft Cabinet files
   http://www.cabextract.org.uk/
* cpio - for extracting cpio archives
   http://www.gnu.org/software/cpio/
* icat and fls - forensics tools from the sleuthkit software
   http://sourceforge.net/projects/sleuthkit/
* isoinfo - read metadata info from ISO9660 images, from the
   mkisofs/cdrtools/cdrkit implementations
   cdrkit implementation http://debburn.alioth.debian.org/
* pdftotext - from the xpdf software
   http://www.foolabs.com/xpdf
* rpm and rpm2cpio - for extracting software and metadata from rpm packages
   http://www.rpm.org/
* tar - for extracting tar (tape archive) files
   http://www.gnu.org/software/tar/
* upx-ucl - an executable compressor/decompressor
   http://upx.sourceforge.net
* unrar-free - Unarchiver for .rar files
   https://gna.org/projects/unrar/
* unzip - De-archiver for .zip files
   ftp://ftp.info-zip.org/pub/infozip/src/
* zcat - for uncompressing .gz and .Z files, from the gzip software
   http://www.gzip.org/

 On Debian (or a Debian derived distribution) you should be able to use 
 something like:
   apt-get install binutils bzip2 cabextract cpio sleuthkit \
        mkisofs xpdf-utils rpm tar upx-ucl unrar-free unzip 
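
After installing, you may want to confirm that the external commands are
actually on your PATH. The following is a minimal sketch; the helper
missing_commands is hypothetical, and the command names are assumed to
match the list above (on your distribution some binaries may be installed
under a different name):

```shell
# Print every command from the argument list that is not found on PATH.
# missing_commands is a hypothetical helper, not part of FOSSology.
missing_commands() {
    for cmd in "$@"; do
        command -v "$cmd" >/dev/null 2>&1 || echo "$cmd"
    done
}

# Names taken from the list above; adjust for your distribution.
missing=$(missing_commands ar bzcat cabextract cpio icat fls isoinfo \
    pdftotext rpm rpm2cpio tar upx-ucl unrar-free unzip zcat)
if [ -n "$missing" ]; then
    echo "Not found on PATH:" $missing
else
    echo "All external commands found."
fi
```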

*** Adjusting the Kernel ***
On modern large-memory systems (>4 GB), the Linux kernel needs to be
adjusted to give PostgreSQL more SysV shared memory.

To set on a running system:
  # echo 512000000 > /proc/sys/kernel/shmmax
To make sure it gets set on boot:
  # echo "kernel.shmmax=512000000" >> /etc/sysctl.conf

This number is the maximum size of a single shared memory segment in
bytes. The right value is based on a fairly complicated formula; please
see the PostgreSQL tuning part of the user guide.
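
To check whether the running kernel already allows enough shared memory,
something like the following sketch may help. The helper
shmmax_sufficient is hypothetical, and its optional second argument is an
assumption added only so the check can be pointed at a different file:

```shell
# Succeeds when shmmax is at least the wanted number of bytes.
# shmmax_sufficient is a hypothetical helper, not part of FOSSology.
shmmax_sufficient() {
    wanted=$1
    file=${2:-/proc/sys/kernel/shmmax}
    current=$(cat "$file" 2>/dev/null) || return 2
    # awk compares as floating point, which copes with the very large
    # values that some kernels report
    awk -v c="$current" -v w="$wanted" 'BEGIN { exit !(c + 0 >= w + 0) }'
}

if shmmax_sufficient 512000000; then
    echo "shmmax is large enough"
else
    echo "shmmax is too small or could not be read"
fi
```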

*** Setting up Users and Groups ***
You are expected to already have a "postgres" user as part of the
system postgresql install, and a "www-data" user as part of the
apache2 install.

FOSSology requires a system user and a system group both named
"fossy". The /etc/passwd entry for this user should look
something like (Note: your uid & gid values may be different):

  fossy:x:1001:1001:FOSSology:/srv/fossology:/bin/false

and the /etc/group entry

  fossy:x:1001:fossy

On a system with the useradd and groupadd commands (such as Debian based
systems) you can create the above system user and group with the following
commands as root:
  groupadd fossy
  useradd -c FOSSology -d /srv/fossology -g fossy -s /bin/false fossy

Alternatively, you can use the adduser command:
  adduser --gecos "FOSSology" --home /srv/fossology --shell /bin/false fossy
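
Before or after creating the account, a quick check along these lines can
tell you whether the user is already present. This is only a sketch; the
helper user_exists is hypothetical:

```shell
# user_exists is a hypothetical helper: succeeds when the account is known.
user_exists() {
    id "$1" >/dev/null 2>&1
}

if user_exists fossy; then
    # id -gn prints the name of the user's primary group
    echo "fossy exists, primary group: $(id -gn fossy)"
else
    echo "fossy not found; create it as root with the commands above"
fi
```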

Your system should now be ready for installing FOSSology!
  

Untar the tarball
=================

In the next step, you will need to use files included in the tarball.  If
you haven't done so already, untar the fossology-<version>.tar.gz tarball
into a staging directory.


Preparing Postgresql
====================
Your PostgreSQL install should be configured and running. If you need
help doing that, consult the user documentation at
http://www.postgresql.org/docs/manuals/. If you are using SSL, see in
particular the section "Secure TCP/IP Connections with SSL" to set it up.

Edit /etc/postgresql/<version>/main/postgresql.conf:
The tuning and preferences in the config file will vary depending on your
installation.  Here are the results of a diff between the default config file
and the one we use:


> #hba_file = 'ConfigDir/pg_hba.conf'	# host-based authentication file
> #ident_file = 'ConfigDir/pg_ident.conf'	# IDENT configuration file
> #external_pid_file = '(none)'		# write an extra pid file
> listen_addresses = '*'
> max_connections = 50
> #shared_buffers = 1000		# min 16 or max_connections*2, 8KB each
> shared_buffers = 32768
> work_mem = 10240  
> max_fsm_pages = 100000		# min max_fsm_relations*16, 6 bytes each
> fsync = off   
> full_page_writes = off		#recover from partial page writes
> commit_delay = 1000 
> effective_cache_size = 25000
> log_min_duration_statement = -1	# -1 is disabled, 0 logs all statements
> #log_line_prefix = ''			# Special values:
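
To see how the shared_buffers setting above relates to the kernel shmmax
value set earlier, the arithmetic can be sketched as follows. The helper
shared_buffers_bytes is hypothetical, and the 8 KB buffer size is taken
from the comment in the diff above:

```shell
# Convert a shared_buffers setting (counted in 8 KB buffers) to bytes.
# shared_buffers_bytes is a hypothetical helper for illustration only.
shared_buffers_bytes() {
    echo $(( $1 * 8192 ))
}

# 32768 buffers of 8 KB each is 256 MB
shared_buffers_bytes 32768
```

PostgreSQL needs some shared memory in addition to shared_buffers, so
the shmmax value has to be comfortably larger than this result.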

Setting up the database:
A sample database schema file can be found in the tarball under
fossology-<version>/setup/fossologyinit.sql.  This will create the database when
it is read in.  It will also create an owner for the database "fossy"
and owner's password "fossy".  For security reasons, please change this
password, after the database is created, using the alter command.
Detailed steps follow:

Check to see the database server is running
1) Ensure that postgresql is running using this command:
   /etc/init.d/postgresql-8.1 status
   Version Cluster   Port Status Owner    Data directory                     Log file
   8.1     main      5432 online postgres /var/lib/postgresql/8.1/main       /var/log/postgresql/postgresql-8.1-main.log

If the status reported is "online", you can proceed to the next step.  If the 
status is "down", use this command to start postgres:
   /etc/init.d/postgresql-8.1 start

You should see the following message:
   Starting PostgreSQL 8.1 database server: main.


2) The default database and owner are created by reading in a file of SQL
statements that define the database schema. This file is included in the
tarball as fossology-<version>/setup/fossologyinit.sql.  You must be logged in
as user postgres to create the database schema and define the owner.  As root,
you can su to user postgres (su postgres) and then run the psql command:
   psql < fossology-<version>/setup/fossologyinit.sql

3) If any steps fail, check the postgres log file for errors:
   /var/log/postgresql/postgresql-8.1-main.log

4) Make sure /etc/postgresql/8.1/main/pg_hba.conf is configured correctly 
to allow your connection.  This file controls: which hosts are allowed to
connect, how clients are authenticated, which PostgreSQL user names they 
can use, which databases they can access.  As a starting point, you will
need something like the following for local connections:

    # local      DATABASE  USER  METHOD  [OPTION]
      local       all      all   md5

   See http://www.postgresql.org/docs/current/static/client-authentication.html
   for detailed information.

5) After editing the pg_hba.conf file, restart the postgres server:
     /etc/init.d/postgresql-<version> restart

6) Once the database is defined, verify connection with
     psql -d fossology -U fossy

   Use the default password "fossy".  You should connect and see the following:
     Welcome to psql 8.1.9, the PostgreSQL interactive terminal.

     Type:  \copyright for distribution terms
            \h for help with SQL commands
            \? for help with psql commands
            \g or terminate with semicolon to execute query
            \q to quit

     fossology=>

7) If any steps fail, check the postgres log file for errors:
   /var/log/postgresql/postgresql-8.1-main.log
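
Once the connection works, the default password should be changed with
the alter command, as recommended above. The following sketch builds the
statement; the helper alter_password_sql and the example password are
hypothetical, and the helper only escapes single quotes (it makes no
attempt to handle backslashes):

```shell
# Build an ALTER USER statement, doubling any single quotes in the password.
# alter_password_sql is a hypothetical helper, not part of FOSSology.
alter_password_sql() {
    user=$1
    escaped=$(printf '%s' "$2" | sed "s/'/''/g")
    printf "ALTER USER %s WITH PASSWORD '%s';\n" "$user" "$escaped"
}

# Print the statement for review (placeholder password)
alter_password_sql fossy 'new-password'
```

After reviewing the output, you could pipe it to psql while logged in as
user postgres, for example:
alter_password_sql fossy 'new-password' | psql -d fossology
Remember to put the same password in the dbconnect file described later.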

Configuring PHP
===============
Some php config variables need to be adjusted for FOSSology. Edit your
php.ini file for apache (location dependent on how you installed, but probably
something like /etc/php5/apache2/php.ini) and make the following
changes by locating where the variable is set and adjusting:

  max_execution_time = 90
  memory_limit = 128M
  error_reporting  =  E_ALL & E_STRICT
  display_startup_errors = On
  log_errors = On
  log_errors_max_len = 0
  error_log = /var/log/php5.log
  post_max_size = 700M
  upload_max_filesize = 701M
In the "[soap]" section add
  extension=pgsql.so

You should also edit /etc/php5/cli/php.ini to include the fossology directory
in the php command line interface path:
  include_path = ".:/usr/local/share/fossology/www"
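
To double-check your edits, a small sketch like the one below can print
the values that are actually set in a php.ini file. The helper ini_value
is hypothetical, and it assumes simple "key = value" lines with no
trailing comments:

```shell
# Print the last uncommented value assigned to a key in an ini file.
# ini_value is a hypothetical helper; $1 is the file, $2 is the key.
ini_value() {
    sed -n "s/^[[:space:]]*${2}[[:space:]]*=[[:space:]]*//p" "$1" |
        sed 's/[[:space:]]*$//' | tail -n 1
}

# The path below is the likely location mentioned above; adjust as needed.
ini=/etc/php5/apache2/php.ini
if [ -r "$ini" ]; then
    for key in max_execution_time memory_limit post_max_size \
        upload_max_filesize; do
        echo "$key = $(ini_value "$ini" "$key")"
    done
fi
```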


Configuring Apache
==================
1) You need to add something like the following to the Apache config.
   The details depend on:
 A) How you have Apache configured. You might add it to the
    "default" site (/etc/apache2/sites-available/default) or to some
    other config you have set up.
 B) The path where you want the FOSSology UI to appear on the server. This
    example uses "/repo/".
 C) Where FOSSology is installed. This example assumes the default
    local sysadmin share prefix of /usr/local/share/.

========================================================================
        Alias /repo/ /usr/local/share/fossology/www/
        <Directory "/usr/local/share/fossology/www">
                AllowOverride None
                Options FollowSymLinks MultiViews
                Order allow,deny
                Allow from all
                # uncomment to turn on php error reporting
                #php_flag display_errors on
                #php_value error_reporting 2039
        </Directory>
========================================================================
NOTE: included in the above example are some commented lines used for
enabling php error reporting. If you are having problems you might
choose to enable these to help determine the problem. Normally you
probably want them turned off so they don't report confusing error
messages to your end users or reveal information about your system
configuration to potential attackers.

2) Because this software dynamically generates web pages based on the
   database, you may want to tell web robots not to index pages.  You
   can do this with a robots.txt file in your DocumentRoot.  Here is
   a sample that tells all agents to ignore your /repo urls:
========================================================================
    User-agent: *
    Disallow: /repo
========================================================================

Once you have installed the configuration you can test it by running
(as root):
  apache2ctl configtest
and if it tests ok, then you can restart the server with the new config
by running (as root):
  apache2ctl graceful
Note: the site won't work until FOSSology is installed, as described below.
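
If you expect to repeat the test-then-restart sequence while adjusting the
config, the two commands can be chained so the restart only happens when
the test passes. This is a minimal sketch; the wrapper name
reload_if_valid is hypothetical:

```shell
# Restart Apache gracefully only when the configuration parses cleanly.
# reload_if_valid is a hypothetical wrapper around the two commands above.
reload_if_valid() {
    apache2ctl configtest && apache2ctl graceful
}
```

Run reload_if_valid as root after each change to the config.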

Building FOSSology
==================
FOSSology uses a system of variables to control where the build will
install things. These are documented and set in Makefile.conf at the
top of the source tree. The defaults are all set for a standard
local sysadmin install, but if you prefer other locations or are
building distribution or 3rd party packages you can adjust to meet your
needs.

NOTE: some commands are required to be run as the root user and are
indicated so. You should be able to use sudo, "su -", a normal root
login, or your favorite root-obtaining utility.

1) cd into the fossology directory and build the software by running
   "make" followed by "make install".
   "make install" creates an install directory and populates
   it with the files that will be installed on your system.  It also creates
   an install.sh file (the install script) to perform the installation.

2) Run the install script.
   You must run the script as root:  ./install.sh -f
   The install script copies executables to their system directories,
   populates the database with license data, and creates
   default configuration files. You should see this final
   message:

     # Checking configuration files
     Creating configuration /usr/local/share/fossology/dbconnect/fossology
     Be sure to configure /usr/local/share/fossology/dbconnect/fossology for your environment.
     The file /usr/local/share/fossology/repository/RepPath.conf points to the repository.
       The repository can be a mount point, and should be a large disk space.
       If you plan to process ISOs, then consider a terabyte of disk or larger.
     You should check /usr/local/share/fossology/agents/scheduler.conf and
     configure it to match your environment.
     You should check /usr/local/share/fossology/agents/proxy.conf and
     configure it to match your environment.
     All files and configuration files have been placed on the system.
     However, some configuration files needed to be created.
     Check the configuration in the following files and directories:
         /usr/local/share/fossology/dbconnect/fossology
         /usr/local/share/fossology/repository
         /usr/local/share/fossology/agents/scheduler.conf
         /usr/local/share/fossology/agents/proxy.conf
     Then re-run this script.

   The default configuration files MUST BE EDITED.  This is described
   in the following step.

3) Edit the configuration files.
   The first time the install script runs on your system, it will create 
   some default configuration files.  These files control database 
   connections, define proxies (for http, https, ftp) and specify the 
   location of the repository.  The files MUST be modified to run in your 
   environment.  

    /usr/local/share/fossology/dbconnect/fossology
    This file is used by the UI to establish database connectivity.  The
    format is:

	dbname=<dbname>;
	host=<host running postgres server>;
	user=<owner of db>;
	password=<password for owner>;

    /usr/local/share/fossology/repository/Hosts.conf
    This file allows you to separate the repository across multiple systems
    for load balancing.  The default is to create the entire repository on
    the localhost.  See the developers docs at http://fossology.org/docs
    for a discussion on how to spread the repository across multiple hosts.

    /usr/local/share/fossology/repository/RepPath.conf
    This file specifies the path to the repository.  If you use multiple
    systems for load balancing, then their directories should be mounted
    under the repository directory.

    /usr/local/share/fossology/repository/Depth.conf
    This file specifies the depth of the repository tree.  ("3" should be
    deep enough for most needs.)

    /usr/local/share/fossology/agents/scheduler.conf
    This file is used by the scheduler for configuration.
    The host names listed in Hosts.conf could match the host= strings
    here. See the developers document 
    http://fossology.org/docs/scheduler#configuring_the_scheduler for a 
    discussion on how to specify the number of processes per host 
    and agent definitions.  You may also use the utility script, 
    /usr/local/fossology/agents/mkconfig, to generate the scheduler.conf
    file.

    /usr/local/share/fossology/agents/proxy.conf
    This file identifies proxies for using wget on your system.
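
As an illustration of the dbconnect format described above, the following
sketch writes a file with the four expected lines. The helper
write_dbconnect is hypothetical, every value shown is a placeholder, and
the example writes to a scratch file rather than the installed location:

```shell
# Write a dbconnect file in the four-line format described above.
# write_dbconnect is a hypothetical helper; all values are placeholders.
write_dbconnect() {
    target=$1 dbname=$2 host=$3 user=$4 password=$5
    printf 'dbname=%s;\nhost=%s;\nuser=%s;\npassword=%s;\n' \
        "$dbname" "$host" "$user" "$password" > "$target"
}

# Write to a scratch file first so the result can be reviewed
scratch="${TMPDIR:-/tmp}/dbconnect.example"
write_dbconnect "$scratch" fossology localhost fossy 'new-password'
cat "$scratch"
```

After reviewing the scratch file, copy it over
/usr/local/share/fossology/dbconnect/fossology as root.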

4) After editing/verifying the configuration files, you will need to re-run
   install.sh (as root):
	./install.sh -f

   You should see this final message:

     # Adding user www-data to group fossy
     # Automated installation completed
     Be sure to:
       + Install PHP5
       + Configure your web server so /usr/local/share/fossology/www
         is used for the user interface.
       + Add your web server user to group fossy in /etc/group
         and restart your web server.
       + Double check the configuration in the following files and directories:
         /usr/local/share/fossology/dbconnect/fossology
         /usr/local/share/fossology/repository
         /usr/local/share/fossology/agents/scheduler.conf
         /usr/local/share/fossology/agents/proxy.conf

5) Run the check.sh script as root to check the configuration for the
   fossology scheduler:
     ./check.sh -S

   You should see this message from check.sh:
     Checking build dependencies...
     Checking installed files...
     Checking configuration files...
     Checking run-time requirements...
     STATUS: All scheduler jobs appear to be functional.
     Check completed.
     No problems detected!

   If missing/incorrect dependencies, installed files or configuration
   files are reported, fix these errors and re-run check.sh until all
   issues are resolved.

6) Start the scheduler
   NOTE:
   The user starting the scheduler must be in the fossy group.  Use
     addgroup <user> fossy 
   as root to add users to the fossy group.

   Check to see that you are in the fossy group using the command "groups".
   As root, start the scheduler with:
     /etc/init.d/fossology start
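
The group membership check mentioned above can also be scripted. This is
a minimal sketch; the helper in_group is hypothetical:

```shell
# Succeeds when the named user belongs to the named group.
# in_group is a hypothetical helper; $1 is the user, $2 is the group.
in_group() {
    id -nG "$1" 2>/dev/null | tr ' ' '\n' | grep -qxF "$2"
}

me=$(id -un)
if in_group "$me" fossy; then
    echo "$me is in the fossy group"
else
    echo "$me is not in the fossy group yet"
fi
```

Note that a newly added group normally only shows up in the output of
"groups" after you log in again.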

7) Restart the Apache server
     apache2ctl graceful

8) Browse to http://<http_server>/repo/ to log in (as "Default User") and 
   initialize the UI via the Admin -> Initialize menu item.

9) You are now ready to use FOSSology!  See 
   http://fossology.org/user_documentation for a description of the UI.

Install Complete
================
Congratulations, FOSSology is now installed!
For extended discussion on how to use or tune the software, please see
the User Documentation, available on the website, in the doc/ directory
of the source, or the doc/ directory on an installed system (directory
may vary and, in the case of software packages, may be in a different
package you need to install).

