Drosophila melanogaster African Survey

DPGP: Transitional data sets, opportunities for collaboration

DPGP's analysis of 42 Drosophila melanogaster genomes is essentially complete, and a manuscript will be submitted in the very near future. Further questions should be directed to Chuck Langley or Dave Begun.

D. melanogaster genome sequencing has continued at Davis, with a focus on African variation. With the goal of identifying one or two locations in Africa that merit genome sequencing of large population samples, we have initially sequenced a geographically scattered sample of sub-Saharan genomes: a median of three genomes from each of approximately 20 African locations, with a few more to be added. Preliminary analysis of this data motivated us to obtain new, large population samples from Uganda and Zambia (described in a subsequent message). But prior to the sequencing of large samples, the scattered sample itself (which may ultimately contain roughly 80 African genomes) will be one target of analysis, with relevance for understanding the history of the species in its ancestral range, and possibly for detecting local adaptation.

While the new Uganda and Zambia samples were being obtained, a second data set was created, consisting of 27 D. melanogaster genomes from Rwanda. Here the goal was to give the research community access to a somewhat larger sample of African genomes in the short term. The Rwanda genomes (along with most of the data in the scattered sample) were sequenced to >30X depth from libraries prepared from whole-genome amplifications of single haploid embryos (see second message). The data consist of 75 bp paired-end reads with 300 to 400 bp inserts, which will offer new opportunities for studying indels and rearrangements. It is anticipated that the Rwanda genomes and the scattered African genomes will be published separately, with analysis of the Rwanda genomes likely to begin first.
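For anyone planning storage or compute around these data, the depth and read length quoted above translate into a rough number of read pairs per genome. The sketch below is illustrative arithmetic only, not a description of our pipeline: the function name is hypothetical, and the genome size of 120 Mb is an assumed round figure that is not taken from this message.

```python
def read_pairs_for_depth(genome_size_bp, depth, read_length_bp=75):
    """Smallest number of read pairs whose total bases reach the target depth.

    Each pair contributes two reads of read_length_bp bases. Duplicates,
    unmapped reads, and uneven coverage are ignored, so this is a lower bound.
    """
    bases_needed = genome_size_bp * depth
    bases_per_pair = 2 * read_length_bp
    # Integer ceiling division avoids floating-point rounding.
    return -(-bases_needed // bases_per_pair)


# ASSUMPTION: 120 Mb is a round illustrative genome size, not a project figure.
ASSUMED_GENOME_SIZE_BP = 120_000_000

pairs = read_pairs_for_depth(ASSUMED_GENOME_SIZE_BP, depth=30)
print(f"{pairs:,} read pairs")  # prints "24,000,000 read pairs"
```

Under these assumptions, each genome corresponds to at least 24 million read pairs; actual yields will be higher because depth exceeds 30X and not every read maps.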

Data generation for both of these projects is nearing completion. Sequencing reads will be deposited in the NCBI archive. We will generate an assembly for each data set and share it with the research community. It will probably be at least one month before the Rwanda assembly is ready, and at least two months for the scattered African assembly.

We are interested in pursuing a different model of community involvement with the analysis of these data sets. Rather than composing a "genome paper" on our own and encouraging the community to pursue independent analyses, we would like to involve more of the research community in the initial analysis of both new data sets. Please contact me if you are interested in being a part of this collaboration.

Contact: John Pool (below)

DPGP: Large African samples of melanogaster

This message is to update you on the progress of our recent sampling of D. melanogaster in Africa, and to make you aware of newly available population samples that we want to share with the research community. In July, large samples of isofemale lines were established from:

Uganda (~400 lines from Masindi)
Zambia (~420, ~400, and ~85 lines from three separate locations: Livingstone, Siavonga, and Solwezi)
South Africa (~240 lines from Phalaborwa in the northeast part of the country)
France (~180 lines)

We anticipate sequencing perhaps 100 genomes each from France, Uganda, and Zambia (or possibly South Africa). Based on preliminary sequencing of African genomes (described in my previous message), our hope is that Uganda will provide the best proxy for the source of cosmopolitan (non-sub-Saharan) populations, whereas Zambia may be within the region of highest diversity.

We are unlikely to have the capacity to maintain all of these lines, even in single replicates. Therefore, we are quite interested in hearing whether any of you (or others) would like to receive subsets of these population samples. If you are only interested in receiving the lines that are targeted for genome sequencing, please indicate this (we will be maintaining these lines indefinitely). Also please indicate whether you plan to maintain these samples beyond short-term use.

Data from these lines will consist of one haploid genome from each isofemale line. Haploid genomes are obtained by crossing virgin females from the stock of interest to sterile males of the genotype ms(3)K81/ms(3)K81. Rare embryos that develop extensively as matriclinal haploids are then individually collected and their DNA amplified. A standard small-insert Illumina library is constructed from one haploid embryo per independent isofemale line and sequenced to >30X.

The above protocol produces haploid genomes without a need for inbreeding or balancer chromosomes. However, we realize that some analyses would benefit from having the closest possible correspondence between the sequence obtained and the alleles present in a living stock. When practical, we have been inbreeding stocks for five generations prior to sequencing. Ideally, inbred stocks could also be PCR-tested for inversion homozygosity. However, we may not have the person-hours available to conduct inbreeding and inversion testing for several hundred stocks. If your research would benefit from genome sequences from well-inbred living fly stocks, please let us know if you would be willing to help inbreed a subset of these lines.

To recap, please respond and let me know if (1) there are fly samples you would like to help maintain, or (2) you would like to assist with inbreeding / inversion testing.

John Pool (jepool(a)ucdavis.edu)
Postdoctoral Researcher
Drosophila Population Genomics Project
UC Davis