  
                                                           
                                                                
                                                               Original Resources 
                                                          
                                                          
                                                          
                                                        Connecting human disease phenotype to genetic mutation and protein function: A modular data mining short course with an independent project sequence for lecture or lab 
                                                          
                                                         Author(s): 
                                                          
                                                        Janet M. Murray Ph.D., Heather E. Driscoll, and Kara Pivarski 
                                                          
                                                         Overview: 
                                                          
                                                        “Introduction to Data Mining” is a four-session online bioinformatics short course covering literature searches, sequence databases, sequence similarity searches using BLAST, multiple sequence alignment, phylogeny reconstruction, protein structure databases, and 3D viewers. Each session familiarizes undergraduate students with online databases and tools used in scientific study. The module draws on several online resources, databases, and search engines, including NCBI (NCBI Resource Coordinators, 2016), RCSB-PDB (Berman et al., 2000), and Molviz.org (Sayle & Milner-White, 1995), as well as the open-source PyMOL software (Schrödinger, 2015). The complete short course is intended for mid- to upper-level undergraduates in a molecular biology, genetics, or biochemistry course. However, the modular design lets individual instructors use the sessions that meet their needs, and options are provided for adapting the materials to less advanced students. Although many data mining tutorials are available, the unique strength of this educational module is an independent project (sessions 5 and 6) that requires each student to use the data mining tools on their own, building familiarity and competence with the databases and tools introduced in the online tutorial. The resources for these projects are described and can be used separately from the online portion. Together, the short course and independent research projects demonstrate the direct connection between genetic change, protein function, and human (clinical) phenotype. 
                                                          
                                                         Genetics Concept(s) Addressed:  
                                                             
                                                             
                                                        This short course is an introduction to the concepts and principal databases of bioinformatics and structural biology/chemistry. 
														  
                                                         Core Competencies Addressed: 
                                                             
                                                         
														This course should enable students to access and analyze sequence and structure data, explore phylogenetic relationships, create and edit images of protein molecules, generate a hypothesis as to the functional defect of a mutant protein associated with a given human disease phenotype, and present their results in several formats.                                                         
                                                         
                                                            Audience:  
                                                             
															Undergraduates 
                                                       	                                                         
                                                         Activity Type:  
                                                             
                                                            Lab or lecture 
                                                              
                                                         Activity Length:  
                                                             
                                                             
                                                        Six classes of 2 to 2.5 hours each 
                                                           
                                                        Citation: 
                                                           
                                                        Murray, J., Driscoll, H. E., and Pivarski, K. (2018). Connecting human disease phenotype to genetic mutation and protein function: A modular data mining short course with an independent project sequence for lecture or lab. Genetics Society of America Peer-Reviewed Education Portal (GSA PREP); 2018.005; doi: 10.1534/gsaprep.2018.005 
                                                          
                                                           
                                                         
                                                           
                                                          Murray et al 2017 Final 
                                                           
                                                          Murray et al 2017 Supplement 
                                                           
                                                        