Version 8 (modified by sdonegan, 10 years ago)

Triton Configuration

This page describes the configuration of Triton, the machine used to deploy NDG3 services in an operational environment. Triton will replace proglue for NDG "production" services.

The basics: ports 80 and 443 on triton.badc.rl.ac.uk (130.246.191.43) have been opened on the site firewall. Port 5432 (postgres) is open to internal machines only.

Deployment Grid

Please fill in your relevant areas as your services are deployed.

Developer | Service | Deployment status | Documentation | Version | Notes | Tested?
Steve D | NDG Redirection Service | Installed 18/11/09 | n/a | n/a | Installed /usr/local/ndg-redirect | Tested OK 20/11/09 - updates triton db fine.
Steve D | Discovery Postgres Database | Installed 13/10/09 | n/a | n/a | Clone of Neptune (AJH) | n/a
Steve D | Discovery Service ingestion stack | Installed 18/11/09 | n/a | n/a | Installed /usr/local/ndg-discovery-ingest (needs to be connected to triton cronjob) | standard ingest NO / ceda feed YES
Steve D | Discovery Service OAI info editor | Installed 18/11/09 | n/a | n/a | Installed /usr/local/ndg-oai-info-editor - awaiting configuration and update of security certs etc | n/a
Steve D | Discovery Service API | Installed 11/11/09 | n/a | n/a | Runs in axis war file - compile locally and load to axis using ant buildfile | n/a
Steve D | Discovery Service front end (milk stack) | Installed 17/11/09 | n/a | n/a | Installed and operational in new triton buildout environment (http://triton.badc.rl.ac.uk/services/discovery) - connects to Triton API (works). Needs configuring properly, i.e. get view working | n/a
Steve D | Discovery service logging framework | n/a | n/a | n/a | n/a | n/a
Steve D | Discovery Service usage stats API | n/a | n/a | n/a | n/a | n/a
Steve D | DLESE OAI Harvester | Installed 18/11/09: DEPLOYED | n/a | n/a | Installed as WAR file in tomcat - same usernames etc as Neptune. Loaded all Provider info | n/a
Steve D | CEDA ATOM Discovery XML pipeline | Installed 18/11/09 | n/a | n/a | Installed in /home/badc/buildouts/atomFeedDocumentIngester - not strictly an MSI service but running on Triton! Awaiting connection to cron wrapper scripts | n/a
* | * | * | * | * | * | *
Phil K | Security stuff
* | * | * | * | * | * | *
Dom L | COWS stuff
* | * | * | * | * | * | *
Stephen P | COWS stuff

Python Configuration

The system default is Python 2.5 in /usr/bin. Under SuSE, the site-packages location is customised to /usr/local/lib64/python2.5/site-packages via /usr/lib64/python2.5/distutils/distutils.cfg.

Application packages will be installed separately to avoid the version conflicts and maintenance problems that come with a single shared package area. Either virtualenv or zc.buildout could achieve this. virtualenv is easy to set up with mod_wsgi - see Apache Configuration. zc.buildout allows package versions to be pinned explicitly, so that a known combination of packages and versions defines a stable deployment. The zc.buildout recipe collective.recipe.modwsgi enables integration with mod_wsgi. zc.buildout is currently the preferred means of configuration (17/06/2009).

Apache Configuration

Apache on Triton has been built from source (it is not the standard openSUSE distribution package). The base installation directory is /usr/local/apache2. The main configuration file, httpd.conf, is in the conf directory, with all virtual hosts defined in conf/extra/httpd-vhosts.conf. The Apache user is defined as user "wwwrun", group "www" in httpd.conf.

The document root for all HTML files and web-served content is /usr/local/apache2/htdocs. WSGI script files for public services go in /usr/local/apache2/wsgi_scripts_public (use this as the target directory in your buildout).

Restart apache: sudo /etc/init.d/apache2 restart

Postgres DB configuration

Buildouts configuration info

OAI Info Editor configuration & Notes

The NDG OAI Info service allows providers to manage OAI harvests and synchronisation of provider xml with discovery database contents.

The base directory for this service is: /usr/local/ndg-oai-info-editor

The buildout configuration (as of 20/11/09 08:21):

#
# zc.buildout config for the NDG Discovery Service
#
# P J Kershaw 02/06/09
#
[buildout]
parts = OAI_Info_Editor_NDG3
#develop = passwords

# Configuration mirroring eggs as currently deployed on proglue
[OAI_Info_Editor_NDG3]
recipe = collective.recipe.modwsgi
interpreter = python2.5.1
extra-paths = ${buildout:directory}/ingestConfig, ${buildout:directory}/ingestAutomation-upgrade/OAIBatch

# Versioning:
#
# 1) Explicit Pylons and WebHelpers versions required otherwise code
# breaks with WebHelpers 0.6.4 with:
#
# File "/usr/local/ndg-discovery/eggs/ows_server-0.0.0dev_r5354-py2.5.egg/ows_server/config/environment.py", line 22, in load_environment
# tmpl_options['myghty.escapes'] = dict(l=webhelpers.auto_link, s=webhelpers.simple_format)
# AttributeError: 'module' object has no attribute 'auto_link'
#
# 2) cdat_lite should be fixed at 4.1.2 to imitate the proglue settings
# but this build fails.  Using >=, cdat_lite 5 is installed but there
# are then issues with the code not able to find cdms because of the
# change to cdms2. e.g.
#
# http://ndg3beta.badc.rl.ac.uk/services/view/grid.bodc.nerc.ac.uk__DIF__grid.bodc.nerc.ac.uk-DIF-EDMED1048001
#
# Left as an open issue because discovery service is likely to be upgraded to
# the latest version anyway
eggs =
        ows_common==0.1dev_r2969
        ndgCommon==0.1.1.dev_r5997
        csml==2.1b_r3917
        cdat_lite>=4.1.2_0.2.5
#       cdat_lite==4.1.2_0.2.5
        Pylons==0.9.6.2 # changed from 0.9.6.1 to get oai editor going
        PyGreSQL==3.8.1
        PasteScript
        WebHelpers==0.3.2
        oai_info_editor==0.0.0dev_r5673
        oai_document_ingester==0.1.0.dev_r5976
        Routes==1.7.3
        AuthKit==0.4.3ndg_r174
        ndg_security_server # Comment out whilst 4Suite-XML ftp site is down 17/07/09 PJK
config-file = ${buildout:directory}/secured.ini
find-links = http://ndg.nerc.ac.uk/dist
        http://ndg.nerc.ac.uk/dist/archivedcsml

... and the Makefile used to run the buildout and install the generated WSGI script:

#
# Makefile to customise WSGI script generated by zc.buildout and add in logging
# capability.  It also installs the script in the correct area for Apache to pick up
# Alteration of the script is done with a series of ugly sed calls.  This could be
# replaced by a customised Pylons buildout recipe to add the logging calls in
#
# P J Kershaw 04/06/09
WSGI_DIR=/usr/local/apache2/wsgi_scripts_public
WSGI_SCRIPT_NAME=OAI_Info_Editor_NDG3.wsgi
WSGI_SCRIPT_IN_FILE=./parts/OAI_Info_Editor_NDG3/wsgi
#TMP_FILE=${WSGI_SCRIPT_IN_FILE}.tmp
TMP_FILE=${WSGI_SCRIPT_IN_FILE}

install_wsgi:
        @echo installing WSGI script ...
        cp ${TMP_FILE} ${WSGI_DIR}/${WSGI_SCRIPT_NAME}
        @echo Done.

http_proxy=http://wwwcache.rl.ac.uk:8080

buildout:
        export http_proxy=${http_proxy}; /usr/local/bin/buildout
        export PYTHON_EGG_CACHE=/usr/local/egg-cache

Note that PYTHON_EGG_CACHE currently has to be set explicitly in the wsgi file (os.environ['PYTHON_EGG_CACHE'] = '/usr/local/egg-cache') - remember to add "os" to the Python import statement!

Note that harvests etc are all written to: /var/lib/wwwrun/
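
As a rough illustration of that note, the top of a WSGI script might look like the sketch below. Only the cache path comes from this page. The `application` function here is a placeholder assumed for the example; the script generated by the buildout defines its own `application` object, and only the two lines setting the environment variable would be added to it.

```python
import os

# Set the egg cache before anything imports a zipped egg, so that
# setuptools extracts resources into a directory the Apache user can write to.
os.environ['PYTHON_EGG_CACHE'] = '/usr/local/egg-cache'


def application(environ, start_response):
    """Placeholder WSGI callable (hypothetical, for illustration only)."""
    body = b"ok"
    headers = [
        ("Content-Type", "text/plain"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]
```

The assignment has to come before any import that triggers egg extraction, which is why it sits directly under the `os` import.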

Information held for all providers is in: /var/lib/wwwrun/oaiInfoEditorData/provider_info_provider_info_data.xml

Security - TBC

Main Discovery Ingest Configuration & Notes

Discovery Service API configuration & Notes

Discovery Service Portal (MILK) Configuration & Notes

NDG URL redirection service

The redirection service tracks all modified URLs (those altered in the ingest service). This allows NDG to follow the "through" traffic and work out which links and services are most popular. The service comprises three parts:

  • Discovery Service Ingest URL changer (in utilities.py of the oai_document_ingester egg collection): The base URL of the actual web redirection service is hard-coded in abstractDocumentIngester.py. The rest of the ingest then generates a redirection URL to replace the original URL in the ingested XML document. The ingest code also encapsulates information within the redirect URL: the datasetID, datasetName and the original URL. The redirection service extracts this information and writes it to the URL tracking database.
  • NDG Redirection web service: This is a true web service running on triton within its own mod_wsgi buildout. When a URL is clicked within the discovery service portal, the request goes via the redirect service, which uses the encapsulated information to update the URL tracking database and then redirects to the true URL, giving the user the effect of being taken directly to the desired URL.
  • URL tracking database: This is a separate table (urlTracking) within the searchLog database on triton. Every time a redirection URL is clicked, information on the URL, dataset ID and name is recorded. Code within the redirection service on triton increments a counter for the number of access attempts and only adds a new row for a new URL associated with a dataset.
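
The counting behaviour described in the last bullet can be sketched roughly as follows. This is an illustration under stated assumptions, not the deployed code: it uses an in-memory SQLite database in place of the searchLog Postgres database, and the column names (url, dataset_id, dataset_name, hits) and both function names are invented for the example. Only the table name urlTracking comes from this page.

```python
import sqlite3


def create_tracking_table(conn):
    """Create a stand-in for the urlTracking table (column names are assumed)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS urlTracking ("
        "url TEXT, dataset_id TEXT, dataset_name TEXT, hits INTEGER, "
        "PRIMARY KEY (url, dataset_id))"
    )


def record_click(conn, url, dataset_id, dataset_name):
    """Increment the counter for a known url/dataset pair, else add a new row."""
    cur = conn.execute(
        "UPDATE urlTracking SET hits = hits + 1 "
        "WHERE url = ? AND dataset_id = ?",
        (url, dataset_id),
    )
    if cur.rowcount == 0:
        # First click on this URL for this dataset: start a new row.
        conn.execute(
            "INSERT INTO urlTracking (url, dataset_id, dataset_name, hits) "
            "VALUES (?, ?, ?, 1)",
            (url, dataset_id, dataset_name),
        )
    conn.commit()
    row = conn.execute(
        "SELECT hits FROM urlTracking WHERE url = ? AND dataset_id = ?",
        (url, dataset_id),
    ).fetchone()
    return row[0]
```

Calling `record_click` twice with the same URL and dataset leaves one row with a count of 2, while a different URL for the same dataset gets its own row.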

The base URL for the triton redirection service is http://triton.badc.rl.ac.uk/NDGredirection/ndgURLredirect/redirect?url= (NOTE: this really needs to be part of a config file in the ingest!)

For example, this url: http://badc.nerc.ac.uk/data/rapid/

results in this redirect url:  http://triton.badc.rl.ac.uk/NDGredirection/ndgURLredirect/redirect?url=http%3A//badc.nerc.ac.uk/data/rapid&docID=badc.nerc.ac.uk%3ADIF%3Adataent_rapid_SERVICE_TEST&docTitle=NERC%20Rapid%20Climate%20Change%20%28RAPID%29%20programme
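
A minimal sketch of how such a redirect URL can be assembled and unpacked again is given below. It is not the code in utilities.py: the function names are hypothetical, and stripping the trailing slash is an assumption made only so that the output matches the example above. It is written for Python 3; under the Python 2.5 used on triton, `quote` lives in the `urllib` module rather than `urllib.parse`.

```python
from urllib.parse import parse_qs, quote, urlsplit

# Base URL as given on this page.
BASE_URL = ("http://triton.badc.rl.ac.uk/NDGredirection/"
            "ndgURLredirect/redirect?url=")


def make_redirect_url(original_url, doc_id, doc_title, base_url=BASE_URL):
    """Wrap the original URL plus dataset ID and title in a redirect URL."""
    # Assumption: the trailing slash is dropped, as in the example above.
    target = original_url.rstrip("/")
    return "%s%s&docID=%s&docTitle=%s" % (
        base_url, quote(target), quote(doc_id), quote(doc_title))


def parse_redirect_url(redirect_url):
    """Recover (original URL, dataset ID, dataset title) from a redirect URL."""
    query = parse_qs(urlsplit(redirect_url).query)
    return query["url"][0], query["docID"][0], query["docTitle"][0]
```

`parse_redirect_url` mirrors what the redirection service has to do on each click before updating the tracking table and issuing the redirect.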

The redirection web service buildout is located on triton under /usr/local/ndg-redirection with the generated wsgi script placed in /usr/local/apache2/wsgi_scripts_public.

The buildout.cfg for the triton redirection service (as of 20/11/09 09:53) is:

#
# zc.buildout config for the NDG Discovery Service
#
# P J Kershaw 02/06/09
#
[buildout]
parts = ndgRedirect
#develop = passwords

# Configuration mirroring eggs as currently deployed on proglue
[ndgRedirect]
recipe = collective.recipe.modwsgi
extra-paths = ${buildout:directory}/config/redirect.config

# Versioning:
#
# 1) Explicit Pylons and WebHelpers versions required otherwise code
# breaks with WebHelpers 0.6.4 with:
#
# File "/usr/local/ndg-discovery/eggs/ows_server-0.0.0dev_r5354-py2.5.egg/ows_server/config/environment.py", line 22, in load_environment
# tmpl_options['myghty.escapes'] = dict(l=webhelpers.auto_link, s=webhelpers.simple_format)
# AttributeError: 'module' object has no attribute 'auto_link'
#
# 2) cdat_lite should be fixed at 4.1.2 to imitate the proglue settings
# but this build fails.  Using >=, cdat_lite 5 is installed but there
# are then issues with the code not able to find cdms because of the
# change to cdms2. e.g.
#
# http://ndg3beta.badc.rl.ac.uk/services/view/grid.bodc.nerc.ac.uk__DIF__grid.bodc.nerc.ac.uk-DIF-EDMED1048001
#
# Left as an open issue because discovery service is likely to be upgraded to
# the latest version anyway
eggs =
        ows_common==0.1dev_r2969
        ndgCommon==0.1.1.dev_r5997
        Pylons==0.9.6.2 # changed from 0.9.6.1 to get oai editor going
        PyGreSQL==3.8.1
        PasteScript
        WebHelpers==0.3.2
        Routes==1.7.3
        ndgRedirect==1.0.0dev_r5460
config-file = ${buildout:directory}/development.ini
find-links = http://ndg.nerc.ac.uk/dist
        http://ndg.nerc.ac.uk/dist/archivedcsml

... and the Makefile:

#
# Makefile to customise WSGI script generated by zc.buildout and add in logging
# capability.  It also installs the script in the correct area for Apache to pick up
# Alteration of the script is done with a series of ugly sed calls.  This could be
# replaced by a customised Pylons buildout recipe to add the logging calls in
#
# P J Kershaw 04/06/09
WSGI_DIR=/usr/local/apache2/wsgi_scripts_public
WSGI_SCRIPT_NAME=ndgRedirect.wsgi
WSGI_SCRIPT_IN_FILE=./parts/ndgRedirect/wsgi
#TMP_FILE=${WSGI_SCRIPT_IN_FILE}.tmp
TMP_FILE=${WSGI_SCRIPT_IN_FILE}

install_wsgi:
        @echo installing WSGI script ...
        cp ${TMP_FILE} ${WSGI_DIR}/${WSGI_SCRIPT_NAME}
        @echo Done.

http_proxy=http://wwwcache.rl.ac.uk:8080

buildout:
        export http_proxy=${http_proxy}; /usr/local/bin/buildout