GFF3 Indexer
eternal-flame-AD f051c8f7c0
Add TODO
Signed-off-by: eternal-flame-AD <yume@yumechi.jp>
2024-10-14 21:43:14 -05:00

GFidx

GFidx is a GFF3 file indexer. It reads a GFF3 file and creates an index file that can be used to quickly retrieve features by ID, attribute, or range.
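To make the idea of an ID index concrete, here is a minimal sketch in Rust. It is an illustration only and not GFidx's actual index format: the function names `feature_id`, `build_id_index`, and `lookup` are hypothetical, the whole file is assumed to fit in memory as a string, and IDs are stored exactly as written in the file (still percent-encoded). Each ID maps to the byte offset of its line, so a lookup can jump straight to the record instead of scanning the file.

```rust
use std::collections::HashMap;

/// Extract the raw `ID` attribute value from column 9 of a GFF3 feature line.
/// The value is returned as written in the file (no percent-decoding).
fn feature_id(line: &str) -> Option<&str> {
    let attrs = line.split('\t').nth(8)?;
    attrs
        .split(';')
        .find_map(|kv| kv.trim().strip_prefix("ID="))
}

/// Hypothetical in-memory index: feature ID -> byte offset of the line start.
fn build_id_index(gff: &str) -> HashMap<String, usize> {
    let mut index = HashMap::new();
    let mut offset = 0;
    for line in gff.split_inclusive('\n') {
        // Skip directives and comments, which start with '#'.
        if !line.starts_with('#') {
            if let Some(id) = feature_id(line.trim_end()) {
                index.insert(id.to_string(), offset);
            }
        }
        offset += line.len();
    }
    index
}

/// Fetch the line for `id` by seeking to its recorded offset.
fn lookup<'a>(gff: &'a str, index: &HashMap<String, usize>, id: &str) -> Option<&'a str> {
    let start = *index.get(id)?;
    gff[start..].lines().next()
}

fn main() {
    // Tiny made-up input, not taken from a real annotation.
    let gff = concat!(
        "##gff-version 3\n",
        "chr1\tHAVANA\tgene\t100\t200\t.\t+\t.\tID=geneA;gene_name=A\n",
        "chr1\tHAVANA\tgene\t300\t400\t.\t-\t.\tID=geneB;gene_name=B\n",
    );
    let index = build_id_index(gff);
    if let Some(line) = lookup(gff, &index, "geneB") {
        println!("{line}");
    }
}
```

A real indexer would persist the offsets to disk and seek within the file rather than hold everything in memory; the sketch only shows the mapping itself.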

$ gfidx index -f example.gff3
$ gfidx query -f example.gff3

# CDCA8 - Cell Division Cycle Associated 8
relation ENSG00000134690.11

<...>
chr1    HAVANA  CDS     37692905        37693033        .       +       2       ID=CDS%3AENST00000327331.2;Parent=ENST00000327331.2;gene_id=ENSG00000134690.11;transcript_id=ENST00000327331.2;gene_type=protein_coding;gene_name=CDCA8;transcript_type=protein_coding;transcript_name=CDCA8-201;exon_number=3;exon_id=ENSE00000916824.1;level=2;protein_id=ENSP00000316121.2;transcript_support_level=1;hgnc_id=HGNC%3A14629;tag=alternative_5_UTR%2Cbasic%2CGENCODE_Primary%2Cappris_principal_1%2CCCDS;ccdsid=CCDS424.1;havana_gene=OTTHUMG00000004320.2;havana_transcript=OTTHUMT00000012474.1
Query took: 2.961754ms
34 lines found
Query cost 272.00 KB bytes

# Nucleotide Sugar Transporter Family
trie gene_name SLC35

<...>
chr13   ENSEMBL gene    20612161        20612338        .       +       .       ID=ENSG00000222726.1;gene_id=ENSG00000222726.1;gene_type=snRNA;gene_name=RNU2-7P;level=3;hgnc_id=HGNC%3A42505
Query took: 138.7453ms
2926 lines found
Query cost 22.00 MB bytes

range chr3 650000 1500000

<...>
chr3    HAVANA  gene    1595777 1596245 .       -       .       ID=ENSG00000184423.5;gene_id=ENSG00000184423.5;gene_type=processed_pseudogene;gene_name=RPL23AP38;level=1;hgnc_id=HGNC%3A36351;tag=pseudo_consens;havana_gene=OTTHUMG00000154860.1
Query took: 2.999234ms
243 lines found
Query cost 120.00 KB bytes
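
A query such as `range chr3 650000 1500000` can be pictured with the following Rust sketch. It is a simplified illustration under stated assumptions, not GFidx's implementation: the `Feature` and `RangeIndex` types are hypothetical, intervals are treated as closed and 1-based as in GFF3, and a feature is assumed to match when it overlaps the query interval. Features are grouped per sequence and sorted by start, so a binary search discards everything that starts after the query end before the remaining candidates are filtered by their end coordinate.

```rust
use std::collections::HashMap;

/// Simplified feature record (hypothetical; a real GFF3 line has nine columns).
#[derive(Debug, Clone, PartialEq)]
struct Feature {
    seqid: String,
    start: u64,
    end: u64,
    id: String,
}

/// Hypothetical range index: features per sequence, sorted by start coordinate.
struct RangeIndex {
    by_seqid: HashMap<String, Vec<Feature>>,
}

impl RangeIndex {
    fn new(features: Vec<Feature>) -> Self {
        let mut by_seqid: HashMap<String, Vec<Feature>> = HashMap::new();
        for f in features {
            by_seqid.entry(f.seqid.clone()).or_default().push(f);
        }
        for list in by_seqid.values_mut() {
            list.sort_by_key(|f| f.start);
        }
        Self { by_seqid }
    }

    /// Return features on `seqid` that overlap the closed interval [start, end].
    fn query(&self, seqid: &str, start: u64, end: u64) -> Vec<&Feature> {
        let features = match self.by_seqid.get(seqid) {
            Some(list) => list,
            None => return Vec::new(),
        };
        // Everything at or beyond `upper` starts after the query ends.
        let upper = features.partition_point(|f| f.start <= end);
        features[..upper].iter().filter(|f| f.end >= start).collect()
    }
}

fn feat(seqid: &str, start: u64, end: u64, id: &str) -> Feature {
    Feature { seqid: seqid.to_string(), start, end, id: id.to_string() }
}

fn main() {
    // Made-up coordinates for illustration.
    let index = RangeIndex::new(vec![
        feat("chr3", 1_400_000, 1_600_000, "b"),
        feat("chr3", 600_000, 700_000, "a"),
        feat("chr3", 2_000_000, 2_000_500, "c"),
        feat("chr1", 700_000, 800_000, "d"),
    ]);
    for f in index.query("chr3", 650_000, 1_500_000) {
        println!("{}\t{}\t{}\t{}", f.seqid, f.start, f.end, f.id);
    }
}
```

The filter step is still linear in the number of candidates; interval trees or binning schemes avoid that at the cost of a more complex index.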

TODO

  • Reduce index size
  • HTTP Range requests
  • GUI for exploring GFF3 files