Visual cleaning of genotype data.

Kennedy, Jessie; Graham, Martin; Paterson, Trevor; Law, Andy

Authors

Jessie Kennedy

Martin Graham

Trevor Paterson

Andy Law



Abstract

While some data cleaning tasks can be performed automatically, many more require expert human guidance to steer the cleaning process, especially if erroneous or unclean data is a product of relationships between entities. An example is pedigree genotype data: inheritance hierarchies in which the correctness of genotype data for any individual is judged on comparison to their relations’ genotypes, as individuals should inherit DNA from their assumed ancestors. Thus, cleaning this data must consider the relationships between individuals; sometimes this means more data must be cleaned than first assumed, while in other situations it means errors across many individuals can be remedied by cleaning the data of a shared relation. Such judgements require a domain expert to hypothesise the effect changing particular data has on the wider data set. Using a visualization tool with the ability to undertake what-if interactions can assist a user in correctly cleaning such data. We achieve this by closely coupling an existing pedigree visualisation technique, VIPER, with a genotype cleaning algorithm, and then develop necessary extensions to the visualization to allow interactive data cleaning. A comparative user evaluation with biologists shows the advantages of this visualisation design over an existing cleaning tool and we discuss the challenges in the design of visual cleaning tools in which errors may be transitive.
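The inheritance rule the abstract relies on can be illustrated with a minimal sketch. This is not the cleaning algorithm or the VIPER coupling described in the paper; it is a simple single-marker Mendelian consistency check under stated assumptions: each individual has at most one recorded sire and dam, a genotype is a pair of alleles, and an untyped parent is treated as compatible with anything. The names `is_consistent`, `find_inconsistencies`, and the example pedigree are hypothetical.

```python
def is_consistent(child, sire, dam):
    """Return True if a child's genotype could be inherited from both parents.

    Each genotype is a pair of alleles such as ("A", "G").
    A parent genotype of None means the parent is untyped (assumption:
    an untyped parent cannot contradict the child).
    """
    sire_alleles = set(sire) if sire else set(child)
    dam_alleles = set(dam) if dam else set(child)
    a, b = child
    # One allele must come from the sire and the other from the dam, in either order.
    return (a in sire_alleles and b in dam_alleles) or (
        b in sire_alleles and a in dam_alleles
    )


def find_inconsistencies(pedigree, genotypes):
    """List individuals whose genotype conflicts with their recorded parents.

    pedigree:  individual id -> (sire id, dam id)
    genotypes: individual id -> (allele, allele)
    """
    flagged = []
    for individual, (sire, dam) in pedigree.items():
        if individual not in genotypes:
            continue  # nothing to check for an untyped individual
        if not is_consistent(
            genotypes[individual], genotypes.get(sire), genotypes.get(dam)
        ):
            flagged.append(individual)
    return sorted(flagged)


# Hypothetical data: three offspring share one sire whose genotype is suspect.
pedigree = {
    "c1": ("s", "d1"),
    "c2": ("s", "d2"),
    "c3": ("s", "d3"),
}
genotypes = {
    "s": ("G", "G"),
    "d1": ("A", "A"), "c1": ("A", "A"),
    "d2": ("A", "C"), "c2": ("A", "C"),
    "d3": ("A", "A"), "c3": ("A", "A"),
}

before = find_inconsistencies(pedigree, genotypes)

# "What-if" step: mask the shared sire's genotype and re-check.
masked = {k: v for k, v in genotypes.items() if k != "s"}
after = find_inconsistencies(pedigree, masked)
```

In this toy data set all three offspring are flagged at first, yet masking the single shared sire clears every flag. That is the kind of relationship-dependent judgement the abstract describes, where errors across many individuals can be remedied by cleaning the data of one shared relation.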

Citation

Kennedy, J., Graham, M., Paterson, T., & Law, A. (2013, October). Visual cleaning of genotype data. Presented at BioVis 2013

Conference Name: BioVis 2013
Start Date: Oct 13, 2013
End Date: Oct 14, 2013
Publication Date: 2013
Deposit Date: Nov 12, 2013
Publicly Available Date: Dec 31, 2013
Peer Reviewed: Yes
Pages: 105-112
Book Title: Proceedings of BioVis 2013
ISBN: 978-1-4799-1658-0
DOI: https://doi.org/10.1109/BioVis.2013.6664353
Keywords: Pedigree; data cleaning; genotypes; user evaluation
Public URL: http://researchrepository.napier.ac.uk/id/eprint/6454
Publisher URL: http://dx.doi.org/10.1109/BioVis.2013.6664353
Contract Date: Nov 12, 2013
