Notes from the weekly DAS/2 teleconference, 18 Dec 2006

$Id: das2-teleconf-2006-12-18.txt,v 1.3 2006/12/18 18:30:56 sac Exp $

Teleconference Info:
   * Schedule:         Biweekly on Monday
   * Time of Day:      9:30 AM PST, 17:30 GMT
   * Dialin (US):      800-531-3250
   * Dialin (Intl):    303-928-2693
   * Toll-free UK:     08 00 40 49 467
   * Toll-free France: 08 00 907 839
   * Conference ID:    2879055
   * Passcode:         1365

Attendees:
  Affy: Steve Chervitz, Gregg Helt
  UCLA: Allen Day, Brian O'Connor

Note taker: Steve Chervitz

Action items are flagged with '[A]'.

These notes are checked into the biodas.org CVS repository at
das/das2/notes/2006. Instructions on how to access this
repository are at http://biodas.org

DISCLAIMER:
The note taker aims for completeness and accuracy, but these goals are
not always achievable, given the desire to get the notes out with a
rapid turnaround. So don't consider these notes as complete minutes
from the meeting, but rather abbreviated, summarized versions of what
was discussed. There may be errors of commission and omission.
Participants are welcome to post comments and/or corrections to these
as they see fit.

Agenda
-------
   * Global Seq IDs
   * Status reports

Topic: Global Seq IDs:
----------------------

gh: re: global seq ids. would like a page in das2 sources as an xml
format. then we could have this list in the das2 registry so you could
go and look things up rather than just an HTML format.

ee: good. the html could maybe be generated out of the xml.

gh: yes. now it's a wiki page, makes it complicated.

aday: wanted to do a similar thing, automatically process the data. I
have a wiki text parser, partially implemented.

gh: autogenerates from a template somewhere?

aday: a wikitext parser. given a wikitext DOM, you can do what you need
to. can generate something from wikitext rather than put xml into the
wiki.

[A] Allen will help XMLify the das2 global seq IDs wiki page

Topic: Status reports
----------------------

gh: new hardware for affy das2 server, formal paperwork done, should
get it by end of this week. new year, we'll begin working on setting it
up. will allow us to support more organisms, arrays, and genome
versions. 32g memory.
Other work: last teleconf we talked about serving up affy transcriptome
data via das/2. Have talked to Tom Gingeras about getting their data as
plots on a das/2 server. still need to consider what formats to
support. netCDF?

aday: not using now but know how.

gh: currently in bar format, but would like to consider moving to
another efficient format. a plugin to the das2 server that doesn't hold
stuff in memory. too much data. phase 3 whole genome in 5bp resolution,
8 cell lines, 3 replicates. lots of data. indexing scheme to the bar
format used internally in transcriptome group. can grab slices of the
data quickly.

aday: lots of programming overhead to creating a netCDF file. lots of
initialization of the matrix you have to do. loading data in, have to
be aware of how you created the matrix. can only grow along one
dimension. loading needs to be aware of expected growth plan.

gh: slicing returns a self-contained bar file for each slice. could be
a self-contained netCDF file for each slice.

aday: ok. in the order that makes sense for each display. a few dozen
lines to get the thing set up. also, must write file to disk, can't
deal with it in memory only. didn't like that, especially when reading.
have to grab it, download to disk, then open it. brian o'connor can
elaborate re: client side. I ran into it on server side as well.

bo: look in assay package for igb codebase or the hyrax client
codebase. haven't been using it for a while. biggest pain is where to
write a temp file given the OS/platform.
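
The temp-file concern raised above can be illustrated with a minimal
Python sketch. It is not taken from the IGB or hyrax code. It assumes
the downloaded slice is available as bytes and has to be on disk before
a file-based reader can open it. The function name
spool_slice_to_disk and the ".nc" suffix are hypothetical, chosen only
for illustration.

```python
import os
import tempfile


def spool_slice_to_disk(payload: bytes, suffix: str = ".nc") -> str:
    """Write downloaded bytes to a temp file and return its path.

    Hypothetical helper: the name and the default suffix are
    illustration only, not part of any DAS/2 client.
    """
    # tempfile.gettempdir() picks a platform-appropriate directory.
    # It honours the TMPDIR, TEMP and TMP environment variables before
    # falling back to OS-specific defaults.
    temp_dir = tempfile.gettempdir()

    # mkstemp creates the file securely and returns an open descriptor
    # plus the absolute path. The caller must delete the file.
    fd, path = tempfile.mkstemp(suffix=suffix, dir=temp_dir)
    with os.fdopen(fd, "wb") as handle:
        handle.write(payload)
    return path


if __name__ == "__main__":
    slice_path = spool_slice_to_disk(b"example slice bytes")
    try:
        # A file-based reader would open slice_path here.
        with open(slice_path, "rb") as handle:
            print(len(handle.read()), "bytes written to", slice_path)
    finally:
        os.remove(slice_path)
```

Leaving the directory choice to the standard library keeps the caller
free of per-OS path logic. Cleanup stays explicit because the reader
may need the file for longer than a single function call.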

gh: continuing my status: IGB release 3 weeks ago, a few problems with
enhancements to read CHP files (affy std for programs that generate
expression data files). need to do another release to address these
problems (tomorrow expected).
Also, caching problems with igb given full URIs for segments, types in
das/2 feat queries: file names are now too long for some OSs. used
steve's suggestion of md5 digest on it. reverse lookup won't work, but
not an issue now.
Can now turn back on full uri for types and segments in feat queries,
so it should be able to work with biopackages server.

gh: will be a different genome per server. need to resolve this
(resolution of each genome). Need to move to the proper das/2 approved
way. working with affy server to loosen restriction on allowed queries.
also some work on editing html docs.

ee: igb release. working on igb manual. now several versions out of
date. First part: going thru existing manual, fixing things not
currently correct. Second part: going thru release notes and
incorporating new things. Working on first part now. Second part by
later this week.

aday: nothing to report. funding did come through to CSHL this week.
fine now.

gh: sorry for the lapse.

sc: fixed 502 gateway error (mod_proxy tweak). HTML retrieval spec docs
fixups. wikification of biodas.org with Andreas Prlic.

[A] gregg will do his html das/2 retrieval spec fixes by end of week.

gh: (regarding the apache complaint about a bad header from the das/2
server). Looking at the http specs, header has: http version, status
code, optionally text. that's the only line that doesn't need a colon.
maybe i'm not putting http version at the start of the line.

gh: on track for devoting most of Jan to das work?

sc: at least half. will know more this week.

bo: not much das/2 work, focussing on graduating! worked on biopackages
build process. automated build process for building rpms. that has been
completed. build farm fc2, fc5, sunos, i386/686.
allows us to build binaries for the das server for different target
platforms and manage dependencies. can do this for the das/2 server.
One server, uses VMware to run various distributions as virtual hosts
to do the builds.

Other topics:
--------------

bo: how has response been to frozen spec?

gh: brian G and lincoln are happier. haven't heard from anyone else.
would like to hear from folks in England. we haven't advertised it
except on the list. top priority was getting brian and lincoln (NCI)
satisfied that they can use it. that has gone well. lincoln was
reporting 2 weeks ago that things were going well with brian's work.
looking for server to serve up hapmap data. has client library in
place. was having problems with feat requests, we responded, haven't
heard back. Hopefully we'll hear from them next year.

[A] Next das2 teleconf: 8 Jan 2007