Ticket #609 (closed defect: fixed)

Opened 13 years ago

Last modified 12 years ago

Attribute Authority server forking too soon for PML plugin

Reported by: mggr
Owned by: pjkersha
Priority: blocker
Milestone: PROD Final
Component: security
Version:
Keywords:
Cc: mggr

Description (last modified by mggr)

The Attribute Authority server currently forks shortly after initialising so that it can detach and run as a daemon. This causes the current PML plugin to lose its persistent database connection when the parent exits. A run-through of the problem:

  • AA server starts up
  • initialises and calls the PML roles plugin
    • PML roles plugin initialises persistent connection to the database
  • server forks, all file (& socket) handles copied
  • parent server exits
    • "parent" PML roles plugin destroys itself (speculation: during clean-up it closes the database connection)
  • child server receives a request, makes a role map call
    • "child" PML roles plugin tries to query the database, but finds its socket is already closed
    • fails with error:
      <?xml version="1.0" encoding="UTF-8"?>
      <AuthorisationResp>
           <statCode>AccessError</statCode>
           <errMsg>Requesting authorisation: Getting user roles: Error getting 
      roles for user "/CN=mggr/O=NDG/OU=RSDAS": already closed</errMsg>
      </AuthorisationResp>
      

Adding a 30-second wait to the parent side of each fork before it calls exit() allowed requests to a freshly started AA to work correctly until the 30 seconds were up, after which the above errors returned.

In foreground mode (-i) it works fine because there is no parent process that destroys the connection structures.

Some ways to fix this:

  1. change the PML plugin to connect lazily or on every request, or to reconnect if the db connection is down, plus add a bit of documentation warning about fork issues
  2. change the AA to fork before initialising plugins
  3. (cheap) remove daemon mode and, in the init script, just call the AA as "nohup AttAuthorityServer.py > $DEVNULL_OR_LOGFILE 2>&1 &"
  4. change the postgres interface module to one that handles this situation better*, if that's possible

(*We're currently using the somewhat old psycopg v1.1.21. There's a newer version (psycopg v2) that is actually thread safe. I don't know whether it would cope with a structure destroyed by a forked parent quitting, though. Update: it doesn't.)
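For illustration, option 1 might look roughly like the sketch below. This is not the PML plugin's actual code: LazyConnection and its connect argument are hypothetical names, and connect stands in for whatever call the plugin uses to open its psycopg connection. The wrapper opens nothing at initialisation time, so a fork between plugin initialisation and the first request has no connection to break, and it reconnects once if a query fails.

```python
import os


class LazyConnection:
    """Open the database connection on first use and reopen it when needed.

    `connect` is any zero-argument callable returning a DB-API style
    connection. It is a placeholder for the plugin's real connect call.
    """

    def __init__(self, connect):
        self._connect = connect
        self._conn = None
        self._owner_pid = None

    def get(self):
        # Connect if nothing is open yet, or if this process inherited
        # the connection from a parent across a fork.
        if self._conn is None or self._owner_pid != os.getpid():
            self._conn = self._connect()
            self._owner_pid = os.getpid()
        return self._conn

    def run(self, operation):
        """Call operation(conn); on failure, reconnect once and retry."""
        try:
            return operation(self.get())
        except Exception:
            # Assumption: any error here means a dead connection. Real code
            # should catch the driver's specific error classes instead.
            self._conn = None
            return operation(self.get())
```

A role lookup would then go through something like roles = db.run(lambda conn: query_roles(conn, user)), where query_roles is again a made-up helper. The retry is deliberately limited to one attempt so that a genuinely unreachable database still surfaces as an error.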


Phil mentioned a preference for option 2, but would like to wait to see how the new Twisted AA works before spending time on this (setting milestone to PROD to give plenty of thinking time).
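A minimal sketch of what option 2 amounts to, assuming a POSIX system. The names daemonize, start_server, init_plugins and serve are hypothetical and not taken from the AA code; the foreground flag stands in for the -i switch. The only point being made is the ordering: detach first, then let the surviving process create the database connection. A full daemon would also do a second fork, change directory and redirect stdio, which is left out here.

```python
import os


def daemonize():
    """Detach from the parent with a single fork (POSIX only).

    Returns only in the child. The parent leaves via os._exit, which
    skips normal interpreter clean-up.
    """
    if os.fork() > 0:
        os._exit(0)
    os.setsid()


def start_server(init_plugins, serve, foreground=False):
    # Fork first, while no database connections exist yet...
    if not foreground:
        daemonize()
    # ...then initialise plugins, so every connection is created by the
    # process that will actually serve requests.
    plugins = init_plugins()
    return serve(plugins)
```

With this ordering the parent never holds a connection, so nothing it does on exit can affect the child's socket.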

Change History

comment:1 Changed 13 years ago by mggr

  • Description modified

Tested with psycopg2 - didn't help.

On the other hand (and in the wrong ticket), it did fix a caching problem the existing (unmaintained) psycopg1 module has, so it'd be nice to keep this. Hopefully the next release of security will have a newer SQLObject (current one needs a minor patch to work with psycopg2).

comment:2 Changed 12 years ago by pjkersha

  • Status changed from new to assigned

comment:3 Changed 12 years ago by selatham

  • Milestone changed from PROD to PROD Final

comment:4 Changed 12 years ago by pjkersha

  • Status changed from assigned to closed
  • Cc mggr added
  • Resolution set to fixed

I'm assuming that this is closed now.

Mike has security integrated at PML, and this ticket depends on the original security code, which used the Python HTTP server now superseded by Twisted.
