<ysnp.info>
DOWNLOADS: THE OPEN SOURCE
Y-DNA PROJECT
[HOME]
This is an open source project, meaning that all pertinent files now in
development will be linked here in time. More information and updates about
the project can be found at the above link and in the Facebook group, Open
Y-Tree.
The following files are intended for developers who want to validate and
use the Open Y data. Note that FTDNA, YFull, and YBrowse retain full
rights to their data. Downloads are permitted for informational purposes
only, not for commercial use.
- The Open Y haplogroup child/parent
database
From this, paths and whole trees can be constructed. But it's the
haplogroup structure only. Full haplogroups, with their SNP members, are
presently available only at https://ysnp.info/subclades.html. Once the work
on that is done, that database will appear here.
- SNP information extracted from
YBrowse.org
Includes the SNP positions on the Y chromosome, ancestral and derived SNP
values, and their names, including all alternate names.
- CSV file of all multiple SNP names
Labs and other entities register new SNP names at YBrowse.org. Although the
problem now appears to be under control, there are thousands of SNPs with
multiple names. The Open Y project uses the name with the lowest numeric
portion. For example, S200 is used rather than L448, even though the latter
is the more widely used version. The alternate names, of course, cannot be
purged, but it's best that standardized usage be promoted. Note also that
some SNP names appear more than once in the file. It's pointless to remove
the duplicates here, as they would be downloaded again the next day.
- FTDNA recurrent SNPs
The data is space delimited. The first field is the SNP name, followed by the
haplogroups of which it's a member. For more details, see this discussion about the
recurrent SNPs in FTDNA's database.
- YFull recurrent SNPs
As above but for YFull. For more details, see this discussion about the
recurrent SNPs in YFull's database.
- Longhand to shorthand haplogroup
names
This isn't a comprehensive list. Included are only the longhand haplogroup
names (e.g. A1b1) used by YFull. The second field is the preferred
shorthand (SNP-based) name (e.g. A-PF1260). The conversion is essential
to the Open Y project, as it's critical for lining up multiple databases.
However, the list is very much open to discussion, even criticism. Any
errors? Have I missed some? Should some be expunged? This is probably the
only data file that is not automatically regenerated every night. It's
"hand-crafted." :) Help me get it right.
- Simple script to parse a child/parent
database.
Explanation is included in the file.
- Open Y haplogroups with member SNPs
This is a space-delimited file in the form: SNP Haplogroups. At this
release, there are more than 100,000 haplogroups and more than 660,000 SNPs,
with more coming. This version includes all but about 2500 of the FTDNA and
YFull redundant SNPs. They need further parsing.
- Vertical haplogroup tree
A zipped, text-based vertical tree of haplogroups only. Each line leads with
tabs, one for each downstream generation. I've written a tool that creates
an HTML horizontal tree from this format, but the result is far too big for a
browser. I'll see what I can do.
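
For anyone who wants to read the vertical tree programmatically, here is a minimal Python sketch. It assumes only what is described above: one haplogroup per line, with one leading tab per downstream generation. The function name parse_vertical_tree and the sample names (Hg-A and so on) are made up for illustration and are not part of the project's own tooling.

```python
def parse_vertical_tree(lines):
    """Return a {child: parent} dict from tab-indented lines.

    Assumption: depth is the number of leading tabs, and a top-level
    line (no tabs) has no parent, recorded here as None.
    """
    parents = {}
    stack = []  # stack[d] holds the most recent name seen at depth d
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line.strip():
            continue  # skip blank lines
        depth = len(line) - len(line.lstrip("\t"))
        name = line.strip()
        del stack[depth:]  # drop everything at this depth or deeper
        parents[name] = stack[-1] if stack else None
        stack.append(name)
    return parents


# Illustrative input only; these are not real haplogroup names.
sample = ["Hg-A\n", "\tHg-A1\n", "\t\tHg-A1a\n", "\tHg-A2\n"]
tree = parse_vertical_tree(sample)
print(tree["Hg-A1a"])  # Hg-A1
```

The result is the same child/parent shape as the first download in the list, so the two can be cross-checked against each other.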
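
As a rough illustration of how paths can be built from the child/parent database listed first above, here is a hedged Python sketch. The actual file layout is not documented on this page, so the code assumes one whitespace-separated "child parent" pair per line. That layout, the function names, and the sample names are all assumptions. The project's own parsing script, also listed above, is the authoritative reference.

```python
def load_parents(lines):
    """Build a {child: parent} dict from 'child parent' lines (assumed layout)."""
    parents = {}
    for line in lines:
        fields = line.split()
        if len(fields) != 2:
            continue  # skip blank or unexpected lines
        child, parent = fields
        parents[child] = parent
    return parents


def path_to_root(parents, haplogroup):
    """Return the list of haplogroups from the root down to `haplogroup`."""
    path = [haplogroup]
    seen = {haplogroup}
    while path[-1] in parents:
        parent = parents[path[-1]]
        if parent in seen:
            raise ValueError("cycle detected at " + parent)
        path.append(parent)
        seen.add(parent)
    return path[::-1]


# Illustrative data only; these are not real haplogroup names.
sample = ["Hg-B Hg-A", "Hg-C Hg-B", "Hg-D Hg-B"]
print(path_to_root(load_parents(sample), "Hg-C"))  # ['Hg-A', 'Hg-B', 'Hg-C']
```

Walking upward from child to parent keeps memory use small even with more than 100,000 haplogroups, since only the flat dictionary is held in memory.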
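
The naming rule described under the multiple SNP names file (use the name with the lowest numeric portion, so S200 rather than L448) can be sketched in Python as follows. This is one possible reading of the rule, not the project's code: it assumes the "numeric portion" is the first run of digits in the name, and it breaks ties alphabetically, which the page does not specify.

```python
import re


def numeric_part(name):
    """Return the first run of digits in `name` as an int (assumed rule)."""
    match = re.search(r"\d+", name)
    # Names without digits sort last; this is an assumption.
    return int(match.group()) if match else float("inf")


def preferred_name(names):
    """Pick the name with the lowest numeric portion; ties go alphabetically."""
    return min(names, key=lambda n: (numeric_part(n), n))


print(preferred_name(["L448", "S200"]))  # S200
```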
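
The space-delimited files above (the FTDNA and YFull recurrent SNP lists, and the haplogroups-with-member-SNPs file) share the same shape: an SNP name followed by its haplogroups. A minimal Python sketch for reading that shape is below. The function names and sample values are hypothetical, and the code assumes that neither SNP names nor haplogroup names contain spaces.

```python
def parse_snp_lines(lines):
    """Map each SNP name to the list of haplogroups that follow it on the line."""
    snp_to_haplogroups = {}
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        snp, haplogroups = fields[0], fields[1:]
        # If an SNP appears on several lines, merge its haplogroups.
        snp_to_haplogroups.setdefault(snp, []).extend(haplogroups)
    return snp_to_haplogroups


def in_multiple_haplogroups(snp_to_haplogroups):
    """Return only the SNPs listed under more than one distinct haplogroup."""
    return {snp: hgs for snp, hgs in snp_to_haplogroups.items()
            if len(set(hgs)) > 1}


# Illustrative data only; these are not real SNP or haplogroup names.
sample = ["SNP1 Hg-A Hg-B", "SNP2 Hg-A"]
table = parse_snp_lines(sample)
print(sorted(in_multiple_haplogroups(table)))  # ['SNP1']
```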