Barleymap was designed to search the genetic and physical positions of barley markers on the Barley Physical Map (IBSC) and the POPSEQ map. The Morex Genome map was subsequently added in 2017. The current version uses the MorexV3 genome by default.
Barleymap provides three tools to retrieve data from the maps:
- "Find markers": to retrieve the position of markers by providing their identifiers.
- "Align sequences": to obtain the position of FASTA sequences by pairwise alignment.
- "Locate by position": to examine specific loci by map position.
The "Find markers" tool allows searching for loci which are commonly used by the barley community. These loci include genetic markers, genes, BAC contigs, WGS contigs, etc. from different datasets. Their map positions have been previously computed and stored, so that the users can retrieve them by providing the identifier of the locus.
Be aware that "Find markers" datasets were generated using fixed parameters. When a more specific search is needed, e.g. by choosing the alignment tool or its parameters, it is recommended to get the FASTA sequence of the query and use the "Align sequences" tool instead.
As input data, the user must provide a list of identifiers to use as queries. The user must also choose the map (or maps) from which to obtain the positions, using the selection list "Choose maps".
The "Genes/Markers enrichment" area allows the user to customize which additional data will be output along with the map positions of the queries. First, the user can choose whether to show "genes", "markers" and/or "anchored". The last usually refers to WGS contigs, BAC contigs, or other elements associated with map positions (anchored), but which lack a biological meaning per se. The user can also choose whether to "show only main features" for each map. For example, for Morex Genome, "HORVU" genes are configured as "main" whereas "MLOCs" are not. The "Add features" option offers two ways to add additional data to the results:
- "on markers": the additional data is searched for each marker independently. Each additional row is appended after the query position.
- "on intervals": the additional data is searched in the regions defined by all the queries. Each additional row is added only once, and in its actual position in relation to the queries.
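The difference between the two "Add features" modes can be illustrated with a small sketch. This is not Barleymap's own code: the `Feature` type, the `window` parameter and the assumption that an interval spans from the first to the last query on each chromosome are hypothetical, chosen only to show why "on markers" can repeat a feature while "on intervals" lists it once.

```python
from typing import NamedTuple

class Feature(NamedTuple):
    name: str
    chrom: str
    pos: int

def add_on_markers(queries, features, window):
    """Search features for each query independently (hypothetical window size).
    Rows for a query's features are appended right after that query."""
    rows = []
    for q in queries:
        rows.append(q.name)
        for f in features:
            if f.chrom == q.chrom and abs(f.pos - q.pos) <= window:
                rows.append(f.name)
    return rows

def add_on_intervals(queries, features):
    """Search features in the region spanned by all queries of a chromosome
    (assumed here to run from the first to the last query). Each feature is
    listed once, at its own position relative to the queries."""
    rows = []
    for chrom in sorted({q.chrom for q in queries}):
        qs = [q for q in queries if q.chrom == chrom]
        lo, hi = min(q.pos for q in qs), max(q.pos for q in qs)
        inside = [f for f in features if f.chrom == chrom and lo <= f.pos <= hi]
        rows.extend(r.name for r in sorted(qs + inside, key=lambda r: r.pos))
    return rows

# Made-up queries and genes, for illustration only.
queries = [Feature("m1", "chr1H", 100), Feature("m2", "chr1H", 300)]
genes = [Feature("g1", "chr1H", 150), Feature("g2", "chr1H", 290),
         Feature("g3", "chr1H", 900)]

on_markers = add_on_markers(queries, genes, window=200)
on_intervals = add_on_intervals(queries, genes)
```

With these toy values, "on markers" reports `g1` and `g2` after both `m1` and `m2`, whereas "on intervals" reports each of them once, between the two queries.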
Other parameters include whether to show markers with multiple mappings, whether to sort the output by centimorgans (cM) or basepairs (bp), and an option to send the results to an email address provided by the user. Note that the "Sort by" option is applied only to IBSC2012, which has both cM and bp positions available. POPSEQ data are always sorted by cM, and Morex Genome data by bp.
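The "Sort by" rule above can be summarised as a small lookup. This is only a sketch of the stated behaviour; the map labels and the function name are hypothetical and do not correspond to Barleymap's internal configuration.

```python
# Units available on each map, as described above (labels are illustrative).
AVAILABLE_UNITS = {
    "IBSC2012": ("cm", "bp"),   # both cM and bp positions
    "POPSEQ": ("cm",),          # genetic map: cM only
    "MorexGenome": ("bp",),     # physical map: bp only
}

def effective_sort(map_name, requested):
    """Return the unit actually used for sorting on a given map.
    The user's choice is honoured only when the map offers that unit."""
    units = AVAILABLE_UNITS[map_name]
    return requested if requested in units else units[0]
```

For example, requesting `"bp"` on POPSEQ would still give `"cm"` under this sketch, while on IBSC2012 the requested unit is used as given.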
Datasets included in Barleymap web
The following is a list of datasets whose map positions have been pre-computed and stored in this instance of the Barleymap web application. Note that the standalone version or a custom web version of Barleymap could be used to create other datasets.
- BOPA1 dataset: 1,536 sequences.
"BOPA consensus" (e.g.: 11_20003) or "POPA12" identifiers must be provided (e.g.: ABC09016-2-2-348, 7174-365, BOPA1_7174-365, ...).
A full list of markers, different identifiers and their sequences can be found at  (supplementary Table S9).
- BOPA2 dataset: 1,536 sequences.
"BOPA consensus" identifiers must be provided (e.g.: 12_31342, i_12_31342, BOPA2_12_31342).
A full list of markers, different identifiers and their sequences can be found at  (supplementary Table S10).
- Illumina iSelect Infinium: 7,864 sequences.
Identifiers can be provided in different formats (e.g.: i_11_10882, 11_10882, 6964-414, BOPA1_6964-414, ...).
A full list of markers, different identifiers and their sequences can be found at  (supplementary Table 6).
(Illumina Infinium iSelect technology belongs to Illumina®)
- Illumina 50K[*]: 43,078 sequences with positions provided by JHI. Morex Genome and MorexV3 only.
"Illumina 50K" identifiers must be provided (e.g.: JHI-Hv50k-2016-7), but identifiers of markers from previous datasets are also accepted (e.g. SCRI_RS_10006).
A full list of markers, different identifiers and their sequences can be found at [*'] (supplementary Table XX).
(Illumina Infinium technology belongs to Illumina®)
- DArTs: 2,000 sequences (e.g.: bPb-3150 or bPb-3150_PUR_f+r, bPb-2614 or bPb-2614_WSU_r).
Sequences for DArTs can be found at [6'].
- DArTseq SNPs: 8,535 sequences (e.g.: 3254894|F|0 or 3254894).
- DArTseq PAVs (SilicoDArTs): 15,526 sequences (e.g.: 3271396|F|0 or 3271396).
Note that 1,761 DArTseq sequences are PAVs that also contain SNPs, so the identifier is the same for both markers.
(DArTs™ and DArTseq™ technologies belong to Diversity Arrays Technology®)
- Oregon Wolfe Barley GBS SNPs: 34,396 sequences (e.g.: owbGBS1162 or owbGBS34926).
A full list of markers and their sequences can be found at  (supplementary Dataset S1).
- Haruna nijo cultivar flcDNAs: 28,620 sequences (e.g.: AK358336 or AK358336.1).
- HarvEST Unigenes (assembly #36): 70,148 sequences (e.g.: U36_70143 or U36_998).
- IBSC2012 genes[*]: 14,923 HC and 19,415 LC genes (e.g.: MLOC_67805).
- IBSC2012 BES[*]: IBSC_2012 and Morex Genome only. More than 400,000 BAC-End sequences (e.g.: HV_MBa0001A01.f.scf).
- IBSC2012 BAC contigs[*]: IBSC_2012 only. 377,144 BAC contigs. (e.g. HVVMRX83KHA0104A24_HVVMRXALLhA0391C07_v16_c28)
- IBSC2012 WGS contigs (Morex, Barke and Bowman)[*]: Barke and Bowman contigs mapped in IBSC_2012 and Morex Genome only. Morex contigs in POPSEQ map also. (e.g. morex_contig_15371, barke_contig_975766, bowman_contig_387623).
- NCBI barley genes[*]: Morex Genome only. 894 sequences (e.g.: AAD02252.1, dhn11, AAF01699.1).
- IBSC2016 genes[*]: Morex Genome only. 39,734 HC and 41,949 LC genes. (e.g.: HORVU1Hr1G000090).
- PGSB genes: MorexV3 only. 35,826 HC and 45,849 LC genes. (e.g.: HORVU.MOREX.r3.1HG0000030).
- BaRT 1.0 gene models: MorexV3 only. 45,619 genes. (e.g.: BART1_0-u00002).
- Entrez CDS: MorexV3 only. 292 sequences (e.g.: BAO51910.1, LEA3).
- Centromeres: MorexV3 only.
We shall be pleased to add any dataset you suggest to the web application, granted that its use is free and public.
The "Align sequences" tool allows searching the map position of FASTA formatted sequences through alignment. This process is slower than "Find markers", but allows adjusting the alignment parameters as needed and searching for any DNA sequences.
Some of the features of "Align sequences" are:
- Barleymap results are map positions, which may come from different sequence references, which are searched in a pan-genome or multi-reference fashion.
- It allows using different alignment algorithms, which makes it possible to search for sequences with and without introns.
- Most of the details of this process are hidden from the user, who is interested only in the map and its map positions.
In "Align sequences" the user can choose different options for the alignment algorithm, under the option "Choose an action".
- cdna: the recommended option, especially when the queries come from sequences which could have introns, for example those from CDS or from markers produced from RNAseq data. All the alignments are performed using the GMAP aligner.
- genomic: it uses the most popular alignment tool, BLASTN, to perform all the alignments.
- auto: every query is searched with GMAP. For those queries without hits, the search is repeated with BLASTN.
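The "auto" action amounts to a fallback: try GMAP first and rerun only the queries without hits through BLASTN. The sketch below shows that control flow with stand-in aligner functions; `fake_gmap` and `fake_blastn` are hypothetical placeholders and do not call the real tools or reflect how Barleymap invokes them.

```python
from typing import Callable, Dict, List

# An aligner takes a sequence and returns the IDs of the targets it hits.
Aligner = Callable[[str], List[str]]

def align_auto(queries: Dict[str, str], first: Aligner,
               fallback: Aligner) -> Dict[str, List[str]]:
    """Search every query with `first`; repeat with `fallback` only for
    the queries that obtained no hits."""
    hits = {}
    for query_id, sequence in queries.items():
        found = first(sequence)
        if not found:
            found = fallback(sequence)
        hits[query_id] = found
    return hits

# Stand-in aligners with arbitrary toy rules, for illustration only.
def fake_gmap(sequence: str) -> List[str]:
    return ["contig_1"] if sequence.startswith("ATG") else []

def fake_blastn(sequence: str) -> List[str]:
    return ["contig_2"] if "N" not in sequence else []

auto_hits = align_auto(
    {"q1": "ATGCC", "q2": "GGCCA", "q3": "GGNNA"}, fake_gmap, fake_blastn
)
```

Here `q1` is resolved by the first aligner, `q2` only by the fallback, and `q3` remains without hits after both attempts.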
In addition, Barleymap can use 3 different algorithms when searching maps which have more than one database associated with them. The details of how these algorithms work can be found here. What follows is just a brief description of the maps and databases included in this Barleymap web application, and of the algorithms used on them.
References included in Barleymap web
- Morex Genome
- POPSEQ map
- IBSC2012 genetic/physical map
The Morex Genome is an actual genome assembly. Most of the datasets precomputed in Barleymap web are available for this reference (one exception is the IBSC2012 BAC contigs). The main datasets associated with this physical map are the IBSC2016 HC and LC genes (the "HORVUs"), the Illumina 50K markers ("JHIs", "SCRIs", etc.), the Morex WGS contigs and the NCBI genes.
The POPSEQ map is a genetic map with Morex WGS contigs anchored to it. The main datasets associated with this map are the IBSC2012 HC and LC genes (the "MLOCs"), the Illumina 50K markers ("JHIs", "SCRIs", etc.), and the Morex WGS contigs.
The IBSC2012 genetic and physical map has sequences of different nature anchored to it:
- Three WGS assemblies from different cultivars: Morex, Barke and Bowman.
- Morex cultivar sequenced BAC contigs.
- Morex cultivar BAC End sequences.
When a search is performed against the IBSC2012 map, an "exhaustive" algorithm (see Figure below) is used. First, the queries are aligned against the first reference, using GMAP, BLASTN or both depending on the aligner chosen (see the discussion above about the parameters of "Align sequences"). For every query with a hit in the reference, a map position is retrieved. Those queries without a map position are searched in the second reference. This is repeated until all the queries have a map position or all the references have been used once. The order in which databases are used as alignment targets is the same as in the list above.
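The "exhaustive" procedure can be sketched as a loop over the references in order, carrying forward only the queries that still lack a map position. This is an illustration under assumptions, not Barleymap's implementation: the `align` callback, the `positions` lookup and all identifiers in the toy data are hypothetical.

```python
def exhaustive_search(queries, references, align, positions):
    """queries: dict of query ID -> sequence.
    references: database names, in the order they should be tried.
    align(db, sequence): returns the IDs of the targets hit in that database.
    positions: dict of target ID -> map position, for anchored targets only.
    Returns (mapped, unmapped_ids)."""
    mapped = {}
    pending = dict(queries)
    for db in references:
        if not pending:
            break  # every query already has a map position
        for query_id, sequence in list(pending.items()):
            located = [positions[t] for t in align(db, sequence) if t in positions]
            if located:
                mapped[query_id] = located
                del pending[query_id]  # not searched in later references
    return mapped, sorted(pending)

# Toy data: q2 hits an unanchored contig first, then an anchored BAC contig.
toy_hits = {
    ("wgs", "AAA"): ["contig_a"],
    ("wgs", "CCC"): ["contig_unanchored"],
    ("bac_contigs", "CCC"): ["bac_b"],
}
toy_positions = {"contig_a": ("1H", 10.5), "bac_b": ("2H", 40.0)}

mapped, unmapped = exhaustive_search(
    {"q1": "AAA", "q2": "CCC", "q3": "GGG"},
    ["wgs", "bac_contigs", "bes"],
    lambda db, seq: toy_hits.get((db, seq), []),
    toy_positions,
)
```

In this toy run `q1` is positioned from the first reference, `q2` from the second (its first hit lacks a map position), and `q3` stays unmapped after all three references have been used.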
Locate by position
The "Locate by position" tool allows examining the regions of specific map positions, mainly with the purpose of checking which genes, markers or other loci are present in those regions.
The input data are "tuples", with chromosome (or contig) and position (local position within the chromosome or contig) in basepairs or centimorgans (e.g. chr1H 100200).
All the other parameters are identical to those in "Find markers".
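As an illustration of the input format for "Locate by position", the sketch below parses one tuple per line into a chromosome and a numeric position. It assumes whitespace-separated fields, as in the example `chr1H 100200`; the function name and the choice to read positions as floats (so that both bp and cM values fit) are hypothetical.

```python
def parse_positions(text):
    """Parse lines of 'chromosome position' into (chromosome, position) tuples.
    Blank lines are skipped; positions are read as floats so that both
    basepairs and centimorgans can be represented."""
    tuples = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        chrom, position = line.split()
        tuples.append((chrom, float(position)))
    return tuples

parsed = parse_positions("chr1H 100200\n\nchr2H 55.3\n")
```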
At the top of the results page, Barleymap outputs a list of the maps selected by the user. The links on that list can be used to navigate to the results of a specific map.
For every map which the user selected, Barleymap shows up to five tables of results:
The first result shown by Barleymap is a graphical representation of the seven barley chromosomes. Queries with map position are shown on top of those chromosomes. Using the magnifying glass button, the user can toggle between complete chromosomes or just the mapped region.
Below the graphical representation is the "Map" table, with the following fields:
- Marker: identifier of the query sequence, either the user supplied value in "Find markers", the FASTA header of the sequence in "Align sequences", or an arbitrary code "chromosome_position" created in "Locate by position".
- chr: chromosome (or contig or equivalent).
- cM: position in centimorgans. Only for anchored maps with cM positions (IBSC2012 and POPSEQ).
- bp: position in basepairs. Only for anchored maps with bp positions (IBSC2012).
- start: basepairs starting position. Only for physical maps (MorexGenome).
- end: basepairs ending position. Only for physical maps (MorexGenome).
- strand: whether the query aligns to the target strand (+) or to the complementary strand (-). Only for physical maps (MorexGenome).
- multiple positions: whether the current query sequence has more than one different mapping position in the current map.
This field is shown only if the "Markers with multiple mappings" option has been selected.
- other alignments: whether the current query sequence has other alignment targets which lack map position.
At least one unmapped alignment should be found for such query.
Map with markers
The Map with markers table shows the mapping results along with the genetic markers that are located in the same positions (or regions if the search is extended). The table has the same fields as the Map table.
Map with genes
The Map with genes table shows the mapping results along with the genes that are located in the same positions (or regions if the search is extended). The table has all the fields of the previous tables, plus some additional fields, related to functional annotation of genes:
- Gene class: High Confidence or Low Confidence classification.
- Description: human-readable description of the gene.
- InterPro: IPR identifiers for the gene.
- GeneOntologies: GO identifiers for the gene.
- PFAM: Protein Families identifiers for the gene.
Map with anchored elements
The Map with anchored elements table shows the mapping results along with the elements that are located in the same positions (or regions if the search is extended). In this case, they are not genes or markers; anchored elements have map position but often lack biological meaning (e.g. WGS contigs, BAC contigs, etc.). The table has the same fields as the Map and the Map with markers tables.
Unmapped and unaligned markers
In addition to the mapping results, two more tables are shown for each map.
- Unmapped: shows those queries which have an alignment hit (field "Target ID") to a target which lacks map position. Note that queries in this table could still have a map position through a different alignment.
- Unaligned: shows those queries which lack any alignment hit (and thus a map position).
While Barleymap uses HTTPS by default, we cannot guarantee the security of the data used with the web tool.
Should this basic level of confidentiality not be acceptable to some users, we recommend installing the standalone Barleymap version, or setting up their own instance of the Barleymap web version.
This service is available AS IS and at your own risk. EEAD/CSIC do not give any representation or warranty nor assume any liability or responsibility for the service or the results posted (whether as to their accuracy, completeness, quality or otherwise). Access to the service is available free of charge for ordinary use in the course of academic research.
Mascher et al. 2013
Mascher et al. 2017
Close et al. 2009
Comadran et al. 2012
Bayer et al. 2017
Wenzl et al. 2004
Kilian et al. 2012
Poland et al. 2012
Matsumoto et al. 2011
Wu and Watanabe 2005
Altschul et al. 1990
Mascher et al. 2021
Rapazote-Flores et al. 2019