Releases: exomiser/Exomiser
New Java version, new database format, smaller data downloads, more ACMG categories, better reporting...
- Minimum Java version is now Java 17
- Updated database format. REQUIRES DATABASE VERSION 2402 - these are significantly smaller than the previous versions (~50-60% of previous size). See the GitHub discussions section for details.
- Added new GeneBlacklistFilter #457
- Added new ClinVar conflicting evidence counts in HTML output #535
- Added PS1, PM1, PM5 categories to ACMG assignments
- Altered reporting of InheritanceModeFilter to state that the number shown refers to variants rather than genes.
- Updated gene constraints to use gnomad v4.0 data.
- TSV genes, TSV variants and VCF outputs are now each written to a single file in which the possible modes of inheritance are shown together rather than split across separate files.
- Fix for issue #531 where the `priorityScoreFilter` and `regulatoryFeatureFilter` pass/fail counts were not displayed in the HTML.
- Fix for issue #534 where variant frequency and/or pathogenicity annotations are missing in certain run configurations.
- Fix for issue #541 where logging to /tmp/spring.log causes clashes in shared user environments.
- TSV output column `CLINVAR_ALLELE_ID` has been changed to `CLINVAR_VARIANT_ID` to allow easier reference to ClinVar variants.
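Scripts that parse the variants TSV may need to cope with both column names while older result files are still in use. The following is a minimal sketch, not part of Exomiser: the `GENE_SYMBOL` column and the id values are illustrative assumptions, and the function name is hypothetical. It reports which column it found, because ClinVar allele ids and variation ids are different identifiers and should not be mixed.

```python
import csv
import io

OLD_COLUMN = "CLINVAR_ALLELE_ID"   # column name before 14.0.0
NEW_COLUMN = "CLINVAR_VARIANT_ID"  # column name from 14.0.0 onwards


def read_clinvar_ids(tsv_text):
    """Return (column_name, values) for whichever ClinVar id column is present."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    fields = reader.fieldnames or []
    if NEW_COLUMN in fields:
        column = NEW_COLUMN
    elif OLD_COLUMN in fields:
        column = OLD_COLUMN
    else:
        raise KeyError("no ClinVar id column found in TSV header")
    return column, [row[column] for row in reader]


# Made-up rows, for illustration only.
new_style = "GENE_SYMBOL\tCLINVAR_VARIANT_ID\nFGFR2\t12345\n"
old_style = "GENE_SYMBOL\tCLINVAR_ALLELE_ID\nFGFR2\t67890\n"
print(read_clinvar_ids(new_style))
print(read_clinvar_ids(old_style))
```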
Full Changelog: 13.3.0...14.0.0
MT codon tables and Bayesian ACMG
- Updated Jannovar version to 0.41 to fix incorrect MT codon table usage #521
- Downgraded PM2 to PM2_Supporting for variants lacking frequency information #502.
- Updated Acgs2020Classifier and Acmg2015Classifier to allow for PVS1 and PM2_Supporting to be sufficient to trigger LIKELY_PATHOGENIC
- Updated AcmgEvidence to fit a Bayesian points-based system #514
- Removed ASJ, FIN, OTH ExAC and gnomAD populations from presets and examples #513.
- Fix for regression causing `<INV>` variants to be incorrectly down-ranked
- Fix for issue #486 where VCF output includes whitespace in INFO field.
- Logs will now display elapsed time correctly if an analysis runs over an hour (!).
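As an illustration of what a Bayesian points-based ACMG system looks like, here is a small sketch. It assumes the point values and thresholds published by Tavtigian et al. (2020): supporting = 1, moderate = 2, strong = 4, very strong = 8, with benign evidence counted negatively. Whether `AcmgEvidence` uses exactly these numbers is an assumption, and the function names are hypothetical. Under these values PVS1 plus PM2_Supporting gives 9 points, which falls in the likely pathogenic range, consistent with the classifier change listed above.

```python
# Assumed point values per evidence strength (Tavtigian et al. 2020).
POINTS = {"SUPPORTING": 1, "MODERATE": 2, "STRONG": 4, "VERY_STRONG": 8}


def total_points(pathogenic, benign=()):
    """Sum pathogenic evidence strengths and subtract benign ones."""
    return sum(POINTS[s] for s in pathogenic) - sum(POINTS[s] for s in benign)


def classify(points):
    """Map a points total to a classification using the assumed thresholds."""
    if points >= 10:
        return "PATHOGENIC"
    if points >= 6:
        return "LIKELY_PATHOGENIC"
    if points >= 0:
        return "UNCERTAIN_SIGNIFICANCE"
    if points >= -6:
        return "LIKELY_BENIGN"
    return "BENIGN"


# PVS1 (very strong) + PM2_Supporting (supporting)
score = total_points(["VERY_STRONG", "SUPPORTING"])
print(score, classify(score))
```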
Full Changelog: 13.2.1...13.3.0
SV `<INS>` bugfix
This is a bugfix release to address the blanket scoring of `<INS>` variants with a variant score of 1.0. The fix should increase the accuracy of SV call prioritisation.
- Fix for bug where all `<INS>` structural variants were given a maximal variant score of 1.0 regardless of their position on a transcript.
- Added partial implementation of SVanna scoring for coding and splice site symbolic variants.
- Fix for issue #481 where TSV and VCF results files would contain no data when the analysis `inheritanceModes` was empty.
IMPORTANT! This will be the last major release to run on Java 11. Subsequent major releases (i.e. 14+) will require Java 17.
Sometimes it's the little things...
This release adds a couple of minor quality of life features to the CLI and fixes a few bugs.
- New multi-architecture docker images with and without bash #471. These images can be found on https://hub.docker.com/repositories/exomiser
- Deprecated `output-prefix` CLI option (will be removed in next major version) #469
- Added `output-directory` and `output-filename` CLI options to replace `output-prefix` #469
- Added `output-format` CLI option #471
- Fixed excessive CPU usage and application hang after variant prioritisation with large number of results #479
- Fixed issue #478 where gene.tsv output files are empty when running a phenotype only prioritisation.
- Fixed broken links to OMIM in the phenotypic similarity section of the HTML output #465
- Added gene symbol as HTML id tag in gene panel HTML results #422
Automated ACMG, p-values, simpler output, documentation!
The three new features in this release are the automated ACMG classification of small sequence variants, the calculation of p-values for the combined scores, and new, more interpretable TSV and VCF output files.
- Added new automated ACMG annotations for top-scoring variants in known disease-causing genes.
- Added new combined score p-value
- Added new TSV_GENE, TSV_VARIANT and VCF output files containing ranked genes/variants for all the assessed modes of inheritance. Note that these new file formats will supersede the existing individual MOI-specific TSV/VCF files which will be removed in the next major release. See the online documentation for details.
- New updated online documentation! See https://exomiser.readthedocs.io/en/latest/
- New Docker hub images for CLI and web on https://hub.docker.com/u/exomiser
- Added checks to ensure user specifies genome assembly if user specifies VCF path outside of phenopacket/analysis
- Added `--output-prefix` option to enable output prefix directly on the command line
- Updated examples to use the latest recommended settings as per preset derived from 100,000 genomes project
For the latest data, please follow the discussions for announcements: #424
hg38 only configuration bugfix
Bug-fix release. No external changes.
CLI changes
- Bug fix for issue #410 where application fails to start when only specifying hg38 data in `application.properties`
Phenopackets and Structural Variants
This release is primarily focussed on enabling simultaneous prioritisation of structural and non-structural variation with as consistent an API as possible for both types of variation. It also introduces a new API for specifying richer information about a `Sample` based on the v1 GA4GH Phenopacket.
This release requires data version >= 2109 and Java version >= 11 (Java 17 recommended).
Until we're able to upload the data to the usual data.monarchinitiative.org/exomiser/latest you can download the data using these links:
2109_hg19
2109_hg38
2109_phenotype
CLI changes
- Minimum Java version is now set to Java 11
- New structural variant interpretation alongside small variants - requires data version 2109 or higher. This has been tested using Manta and Canvas short-read callers and pbsv long-read caller.
- New command line options for more flexible input: --sample, --output, --vcf, --batch, --preset, --assembly, --ped. Run --help for details
- Phenopackets v1.0 can be used to input sample phenotype data
- Added ability to specify proband age and sex in input options either via a phenopacket or the 'sample' format
- Improved MOI disease - phenotype matching with added Orphanet MOIs
- Improved incomplete penetrance calculation when using the ANY mode of inheritance option
- Added a `minExomiserGeneScore` option for limiting the output genes to have a minimum Exomiser combined score. This is disabled by default. If enabling it, we recommend using a minimum score of 0.7
- BREAKING CHANGE - JSON output changes: `pos` renamed as `start`, `chromosomeName` renamed as `contigName`. Deleted `chromosome` field (use `contigName`). New fields: `end`, `length`, `changeLength` and `variantType`
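Downstream code that still holds variant records in the old JSON shape can be adapted with a simple key mapping. This is a minimal sketch under stated assumptions: it uses the field names exactly as listed above (taking the old field to be spelled `chromosomeName`), the function name and sample record are hypothetical, and it does not try to derive the new `end`, `length`, `changeLength` or `variantType` fields, which only the new output provides.

```python
# Field renames and removals as described in the breaking-change note.
RENAMED = {"pos": "start", "chromosomeName": "contigName"}
REMOVED = {"chromosome"}


def migrate_variant(old_record):
    """Return a copy of an old-style variant record using the new field names."""
    new_record = {}
    for key, value in old_record.items():
        if key in REMOVED:
            continue  # dropped in the new format; contigName replaces it
        new_record[RENAMED.get(key, key)] = value
    return new_record


# Made-up record, for illustration only.
old = {"chromosome": 10, "chromosomeName": "10", "pos": 123256215, "ref": "T", "alt": "G"}
print(migrate_variant(old))
```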
Core API
API breaking changes:
- New target Java version set to 11
- `Exomiser.run()` now requires `Sample` and `Analysis` arguments
- `AnalysisRunner` interface now requires `Sample` and `Analysis` arguments
- `Analysis` fields `vcfPath`, `pedigree`, `probandSampleName` and `genomeAssembly` moved to new `Sample` class
- `PedigreeSampleValidator` moved from `util` into new `sample` package
- Replaced `SampleIdentifierUtil` with `SampleIdentifiers` class
- Replaced `SampleIdentifier` with `SampleData`
- `Variant` now extends `org.monarchinitiative.svart.Variant` - see https://github.com/exomiser/svart/ for details
- Deprecated `VariantCoordinates` - replaced by `org.monarchinitiative.svart.Variant`
- `VariantEvaluation.getSampleGenotypes()` now returns a `SampleGenotypes` class
- Changed `VariantAnnotation` from implementing `Variant` to implementing new `VariantAnnotations` interface
- Updated variant coordinates `getChromosome()`, `chromosomeName()`, `getPosition()`, `getRef()`, `getAlt()` to use `Svart` `contigId()`, `contigName()`, `start()`, `end()`, `ref()` and `alt()` signatures
- Replaced `RsId` with `String` type in `FrequencyData` constructors and return from `hasDbSnpRsID()` method
- Replaced `Contig` class with new `Contigs` class
- `VariantAnnotator` interface changed to `List<VariantAnnotation> annotate(@Nullable Variant variant)`
- `VariantContextSampleGenotypeConverter.createAlleleSampleGenotypes()` method now returns a `SampleGenotypes` object
- `VariantFactory` now a `@FunctionalInterface` with a `createVariantEvaluations()` method
- `VariantFactoryImpl` now requires `VariantAnnotator` and `VcfReader` input arguments
- `VcfCodecs` now requires `List` rather than `Set` inputs
New APIs:
- New protobuf schemas for `Job`, `Analysis`, `Sample`, `OutputOptions`
- New `Exomiser.run(JobProto.Job job)` entry point
- New `FluentAnalysisBuilder` interface implemented by `AnalysisBuilder` and new `AnalysisProtoBuilder` for consistent API between proto and domain classes
- New `AnalysisGroup` class extracted from `AbstractAnalysisRunner`
- New `Sample` class to encapsulate data about the sample, such as `Age` and `Sex`
- New `Age` class
- New `Phenopacket...` classes for reading and converting sample data from v1 phenopackets
- New Proto converter classes
- New `SampleIdentifiers` class
- New `SampleData` class to contain `sampleIdentifier`, `SampleGenotype` and `CopyNumber`
- New `SampleGenotypes` class to handle
- New `CopyNumber` class for handling copy number variation data from VCF
- New `AbstractVariant` class
- New `VariantAnnotations` interface
- New `AlleleCall.parseAlleleCall()` method
- New `Pedigree` `justProband(String id, Individual.Sex sex)` and `anscestorsOf(Pedigree.Individual individual)` methods
- New `SvFrequencyDao`, `SvPathogenicityDao` and `SvDaoUtil` classes
- New `VariantWhiteListLoader` class
- New `JannovarAnnotationService.annotateSvGenomeVariant()` method
- New `JannovarSmallVariantAnnotator` class
- New `JannovarStructuralVariantAnnotator` class
- New `TranscriptModelUtil` class
- New `VcfReader` interface with `VcfFileReader` and `NoOpVcfReader` implementations
- New `VariantContextConverter` class for converting `VariantContext` objects into `Variant`
Other changes:
- Updated Spring Boot to version 2.5.3
- Updated Jannovar to version 0.30
- Updated HTSJDK to version 2.24.1
- `AnalysisResults` now hold references to original `Sample` and `Analysis` objects
- `GenomeAnalysisService` can now return a `VariantAnnotator` object
- `GenomeAssembly` now wraps two `GenomicAssembly` objects
- Added `ClinVarData` `starRating()` and `isSecondaryAssociationRiskFactorOrOther()` methods
- Added DBVAR, DECIPHER, DGV, GNOMAD_SV and GONL SV `FrequencySource`
- Updated `VariantEffectPathogenicityScore` to become `final` and added default inversion score
- Numerous small changes to improve performance.
Discovering the ID
This point release is compatible with the 1902, 2003 and 2007 data releases. We recommend you check for the latest data update at https://data.monarchinitiative.org/exomiser/latest/ to keep Exomiser functioning optimally with the latest data.
New features:
- The JSON output now shows the id of the variantEvaluation taken from the VCF file.
New APIs:
- Added `VariantEvaluation.getId()` and `VariantEvaluation.Builder.id()` methods to store VCF id field contents.
Unifying the disease types
Up to eleven and one more - new pathogenicity scores and a variant whitelist
CLI changes
This release contains significant diagnostic performance improvements due to the inclusion of a high-quality ClinVar whitelist and 'second generation' pathogenicity scores.
- Added new `PathogenicitySource` sources - `M_CAP, MPC, MVP, PRIMATE_AI`. Be aware that these may not be free for commercial use. Check the licencing before use!
- Added new variant whitelist feature which enables flagging of variants on a whitelist and bypassing of `FrequencyFilter` and `VariantEffectFilter`. By default this will use ClinVar variants listed as `Pathogenic` or `Likely_pathogenic` and with a review status of `criteria provided, single submitter` or better. See https://www.ncbi.nlm.nih.gov/clinvar/docs/review_status/ for an explanation of the ClinVar review status.
n.b. This release is incompatible with data release 1811 and below.
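The default whitelist criteria described above can be read as a simple predicate over ClinVar significance and review status. The sketch below is illustrative only and is not Exomiser code: it assumes "or better" follows ClinVar's star ratings for review status, and the function and constant names are hypothetical.

```python
# ClinVar review status mapped to star rating (assumed ordering for "or better").
REVIEW_STATUS_STARS = {
    "no assertion provided": 0,
    "no assertion criteria provided": 0,
    "criteria provided, single submitter": 1,
    "criteria provided, multiple submitters, no conflicts": 2,
    "reviewed by expert panel": 3,
    "practice guideline": 4,
}

WHITELIST_SIGNIFICANCE = {"Pathogenic", "Likely_pathogenic"}


def is_whitelist_candidate(significance, review_status, min_stars=1):
    """True if a ClinVar record meets the default whitelist criteria sketched here."""
    stars = REVIEW_STATUS_STARS.get(review_status, 0)  # unknown status counts as 0 stars
    return significance in WHITELIST_SIGNIFICANCE and stars >= min_stars


print(is_whitelist_candidate("Pathogenic", "criteria provided, single submitter"))
print(is_whitelist_candidate("Pathogenic", "no assertion criteria provided"))
```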
Core API
API breaking changes:
- Removed FREQUENCY_SOURCE_MAP from FrequencySource
- Changed `Frequency`, `RsId` and `PathogenicityScore` static `valueOf()` constructor to `of()`
- Removed deprecated `IntervalFilter.getGeneticInterval()`
- Changed visibility of `PhenodigmMatchRawScore` from public to package private and made immutable
- Changed visibility of `CrossSpeciesPhenotypeMatcher` from public to package private and added static `of()` constructor
- Replaced redundant `Default*DaoMvStoreProto` classes with new `AllelePropertiesDaoMvStore`
- Added `OntologyService` as constructor argument to `AnalysisFactory`, `AnalysisParser` and `AnalysisBuilder`
- Replaced `BasePathogenicityScore.compareTo()` method with default `PathogenicityScore.compareTo()`
- `GeneticInterval` no longer accepts `ReferenceDictionary` as a constructor argument
New APIs:
- Added CADD and REMM to data-genome `AlleleProperty`
- Moved `JannovarDataSourceLoader` from autoconfigure to core module
- Added `AllelePosition.isSymbolic()` method
- Added `Variant.isCodingVariant()` method
- Added `AnalysisBuilder.addIntervalFilter(Collection<ChromosomalRegion> chromosomalRegions)` method
- Added new non-public `FilterStats` class for more accurate filtering statistics
- Added new `AllelePropertiesDao` interface
- Added new `AllelePropertiesDaoMvStore` implementation
- Added new `AllelePropertiesDaoAdapter` to fix issue of Spring cache proxy not being able to intercept internal calls
- Added new `HpoIdChecker` class to return current HPO id/terms for an input id/term
- Added new `HumanPhenotypeOntologyDao.getIdToPhenotypeTerms()` method
- Added new `OntologyService.getCurrentHpoIds()` method
- Added new `SampleGenotype.isEmpty()` method
- Added new experimental `VcfCodecs` class for de/serialising VCF lines
- Added new `JannovarDataProtoSerialiser.loadProto()` method for loading intermediate `JannovarProto.JannovarData`
- Added new `VariantWhiteList` and `InMemoryVariantWhiteList` implementation
- Added new `VariantEvaluation.isWhiteListed()` method and relevant builder methods
- Added new `JannovarDataFactory` for a simple programmatic API to build `JannovarData` objects
- Added new `TranscriptSource` enum
- Added new `PathogenicityScore.of()` static factory constructor
- Added new `PathogenicityScore.getRawScore()` method
- Added default `PathogenicityScore.compareTo()` method
- Added new static `PathogenicityScore.compare()` method
- Added new `ScaledPathogenicityScore` class
- Added new `MpcScore` class
- Added new `Contig` class for converting contig names to integer-based id
Other changes:
- Updated Spring Boot to version 2.1.3
- Updated Jannovar to version 0.28
- Updated HTSJDK to version 2.18.2
- Refactored `FrequencyData` to use array-based backing for 5-10% memory usage improvement and lower GC especially when nearing max memory
- Refactored `AnalysisParser` to utilise `AnalysisBuilder` directly reducing code duplication
- Refactored `AnalysisRunner` classes to utilise new `FilterStats` class
- Refactored `QueryPhenotypeMatch` to store and return input queryPhenotypeMatches argument
- Refactored `VariantDataServiceImpl` to use new AllelePropertiesDao
- Refactored `VariantDataServiceImpl` for better readability and performance
- Added check for obsolete HPO id input in `AnalysisBuilder.hpoIds()`
- Re-enabled `PhenixPrioritiser` in `AnalysisParser`
- Refactored `VariantEvaluation.getSampleGenotypeString()` implementation to use `SampleGenotype` instead of `VariantContext`
- Refactored `VariantEffectCounter` internals with `VariantEvaluation` calls in place of `VariantContext`
- Enabled flagging of variants on a whitelist and bypassing of `FrequencyFilter` and `VariantEffectFilter`
- Changed `DefaultDiseaseDao` to only return diseases marked as having known disease-gene association or copy-number/structural causes
- Added range check to `BasePathogenicityScore` constructor
- Updated `CaddScore` and `SiftScore` to extend `ScaledPathogenicityScore`
- Updated `CaddDao` to use CADD phred scaled score directly
- Replaced production use of `ReferenceDictionary` from `HG19RefDictBuilder` with `Contig`
- Added new `PathogenicitySource` sources - `M_CAP, MPC, MVP, PRIMATE_AI`. Be aware that these may not be free for commercial use.