Changes between Version 8 and Version 9 of EsgMeetings/Telecon20090922


Timestamp:
24/11/09 12:53:05 (12 years ago)
Author:
spascoe
Comment:

Page moved to go-essp wiki

  • EsgMeetings/Telecon20090922

    v8 v9  
    1 = ESG Technical Telecon, 2009/09/22 = 
    2  
    3 [[PageOutline]] 
    4  
    5 == Objectives == 
    6  
    7 To touch base and resolve whatever issues we can within an hour.  To highlight what we can address at GO-ESSP and what will need addressing after that. 
    8  
    9 == Agenda == 
    10  
    11  * Data nodes 
    12    1. Progress on installing software at BADC/DKRZ 
    13    2. Partitioning between CMIP5 core data and CMIP5 non-core data.   
    14      - Are core and non-core data going to be served from the same data-node instance? 
    15      - How does esg-publisher differentiate core and non-core data? 
    16      - When will the data be divided between core and non-core and by whom (datanode or modelling centre)? 
    17      - How should this be laid out on the filesystem? 
    18    3. What metadata is created by esg-publisher: THREDDS XML, CDML, NcML? (This is relevant for planning METAFOR activities.) 
    19    4. Over what protocol is data sent to the Gateway by esg-publisher? 
    20  
    21  * Replication 
    22    1. Progress on supporting replication in esg-publisher 
    23    2. How will Gateways be aware of different replicas? 
    24    3. Data flow. 
    25      - Who will replicate from whom?   
    26        - Will all replication nodes replicate from PCMDI or will PCMDI replicate some datasets from BADC/DKRZ? 
    27      - What is the *human process* for replication?   
    28        - How will we know what needs replicating?   
    29        - Will the replicator PUSH or PULL the data? 
    30  
    31  * Gateways 
    32    1. Are BADC/DKRZ going to deploy their own Gateway?  If so, is that one European Gateway or two? 
    33    2. If so, when will the Gateway software be ready for test deployment at BADC/DKRZ? 
    34    3. What is the MyProxy configuration -- one MyProxy per Gateway or one per node?     
    35   
    36  * Bulk Data Movement 
    37    1. Status -- when will ESG nodes be able to try this software? 
    38    2. Can we try using GridFTP between nodes before the full BDM system is ready? 
    39     
    40  * Versioning 
    41    1. This was part of the original requirements capture but doesn't appear to be part of the current ESG design.  Will ESG support any notion of dataset version?  What happens when data is updated? 
    42  
    43  
    44 === To discuss but cover in detail at GO-ESSP === 
    45  
    46  * System testing 
    47    1. When should the data node installations be ready to do some initial tests? 
    48    2. What is the minimal data node software requirement for initial IPCC data delivery? 
    49    3. What should be tested between partners, and in which order? 
    50      - data transfer --> bandwidth etc. 
    51      - metadata exchange --> OAI, THREDDS catalogue? 
    52   
    53  * Overall deployment and data timeline. 
    54    1. Are we on course for Dec 2010 opening of the archive?    
    55    2. We need to fit interim milestones into this.  E.g.: 
    56      - Data nodes publish test datasets to Gateways 
    57      - Test inter-gateway metadata harvesting 
    58      - Test dataset replication 
    59      - Receipt of datasets from data centres 
    60      - Replication of actual datasets 
    61  
    62 == Minutes == 
    63  
    64  Attendees:: Luca Cinquini, Franc, Dean Williams, Arie Shoshani, Rachana, Alex Sim, Neil Miller, Gavin Bell, Roland, Bryan Lawrence, Phil Kershaw, Anne Chervenak, Bob Drach, Michael Laugenschage, Stephan Kindermann, Craig Ward, Karl, Pauline, Don Middleton, Cecilia, V Balaji. 
    65  
    66 === Summary of GO-ESSP Monday agenda meeting === 
    67  
    68  * BL: Summarised the meeting held just prior to this one to set the agenda for GO-ESSP Monday.   
    69    * Focus on organisational issues, data flow, timeline 
    70    * Address replication in detail 
    71    * Tackle some technical issues 
    72  
    73 === Recent progress on installing ESG Data Node === 
    74  
    75  * Gavin Bell has provided an installer 
    76  * Stephen Pascoe (SP) believes that all the software should now be in place at BADC. 
    77  * Nothing has been published yet. We want to publish to the PCMDI Gateway. 
    78  * Bob Drach (BD) needs to be informed when BADC is ready to publish, as a few manual issues need addressing. 
    79  * Stephan (DKRZ) has basic software in place. 
    80  
    81 === ESG Publisher === 
    82  
    83  * SP: My understanding is that ESG Publisher scans the NetCDF metadata and converts to NcML. 
    84  * BD: The full metadata is in the database. THREDDS XML catalogues are created (some of which may be NcML). The THREDDS catalogue provides the aggregation function. Much of the CIM/Curator metadata will come from elsewhere. 
    85  * BD: It needs ports 80 and 443 open. 
    86  
    87 === Core and non-core data === 
    88  
    89  * SP: How does data get defined as "core" or "non-core"? 
    90  * Dean Williams (DW): We will know more about that by the end of September. 
    91  * Karl Taylor (KT): "Standard output" is everything on the CMIP5 web site. It may not be feasible to provide it all, therefore it may be split into a subset called "core" and anything outside of that will be "non-core". 
    92  
    93 === Replication === 
    94  
    95  * SP: Have there been more developments on replication? 
    96  * Anne?: We have been working on the basics of replication. We still have to resolve some of the complex issues, such as who is replicating from whom. We are working on the assumption that the replication mechanism will be pull-based. Alternatively there could be an automated system. We can discuss this in more detail at GO-ESSP. 
    97  * ??: We would like to do tests with the Bulk Data Mover in October.  
    98  * Replication can be considered in 2 parts: correct metadata and shipping data. 
    99  * Need to have a big discussion on replication 
    100  
    101 === Gateway === 
    102  
    103  * SP: We will definitely deploy a Gateway. 
    104  * DW: You need one MyProxy node per Gateway. 
    105  * PK: Think of the Gateway as a MyProxy provider. 
    106  
    107 === Versioning === 
    108  
    109  * How will it be done? 
    110  * DM: I think we need versioning capability. 
    111  * BD: Initial design has been done: an API spec. exists.  We need to decide whether versioning happens at the dataset or data (file) level. 
    112  * BL?: How does this fit with Data Reference Syntax? 
    113  * LC: The Gateway software has begun to move away from the concept of an atomic dataset towards a hierarchy. 
    114  * BL: Atomic datasets and a hierarchy aren't inconsistent.  The DRS has been agreed and fixed. 
    115  * VB: Yes, DRS is integral to our workflow at GFDL.  CMOR2 implements the DRS and we need it to be fixed. 
    116  * BD?: Datanode code can accommodate DRS. 
    117  
    118 === Technical Topics for GO-ESSP Monday === 
    119  
    120  * Replication 
    121  * Versioning 
    122  * Differentiating core and non-core data 
    123  * DRS 
    124  * Security structure 
    125   
     1 MOVED TO http://proj.badc.rl.ac.uk/go-essp/wiki/CMIP5/Meetings/Telecon20090922