KGI's GENE 5110
Computing for Insight with Python
Overview/Syllabus, Fall 2022



 


To the KGI GENE5110 home page (hosted at Piazza)

 

 

KGI-Format Syllabus

 

Catalog Description

 

[Motivation]

Genetics, bioinformatics, and the broader health sciences depend deeply on computing. Creative and effective professionals in those fields use computing proactively: they are comfortable and capable across many different computational ecosystems.

[Description]

Specifically, computationally-empowered professionals draw on an experiential foundation of learning and problem-solving in bioinformatics and computing problem-spaces. They have worked in both -- at the same time -- and have worked through uncertainty and novelty from both directions. This class is a one-semester investment in precisely these foundational experiences.

[Student Learning Outcomes]

In this class, every student will (1) read, run, author, and test their own small software programs (scripts). [In recent years, the scripting language used has been Python; we will continue to adapt.] In addition, each student will (2) explore genomic/bioinformatic data and processes through their own, and others', scripts, (3) build familiarity and problem-solving experience across several libraries, environments, and computing traditions which are common across bioscience communities, and (4) investigate, complete, and reflect on a sizable, self-designed computational project -- one that, if desired, can springboard into further, future computational pursuits.

(3 credit hours)


General Information

 

Instructor:   Zach Dodds
Office:  McGregor 326 on Harvey Mudd's campus
Directions to office:  
Enter the McGregor building at the corner of 12th and Dartmouth. Head up the orange spiral stairs to the third floor; 326 is on the northern side.
Note, however, that the Official Office Hours are held in the LAC (see below).
Phone:  x71813   (909-607-1813)
Email:   dodds@cs.hmc.edu
Official Office Hours  F 1:30-3:30pm at the Linde Activities Center (LAC) at Harvey Mudd (here is an HMC map; search for Linde). It's the building at the intersection of Mills and Foothill.
Real Office Hours  Anytime - just email me to set up a meeting...



TAs:   C. Laurel Anderson
Email:   See Piazza!
Official Office Hours   See Piazza!


Is GENE 5110 for you?

 

Absolutely! GENE 5110 is intended as an introduction to programming and problem solving in the Python language, and it can be worthwhile even for experienced programmers who want to learn or brush up on a powerful language. It assumes neither experience with nor knowledge of programming. A similar course, CS5, is part of the Harvey Mudd core curriculum, and this one is a version tailored to be a technical analog (perhaps homolog?) at KGI. There is a CGU version, as well: for conversion, CGU's IST341 is very similar to KGI's GENE5110, which is very similar to the pair of courses at HMC: CS5 and CS35. Students from a variety of disciplines have also found its skillset and mindset helpful in developing insights in their own fields.


Date / Time / Place

 

When it's taught as HMC's CS5, there are separate lecture and lab meeting times -- for this graduate elective, we will combine both of those portions into our single weekly meeting at KGI (Saturdays, 12:30-4:00pm).

  • GENE 5110 at KGI: Saturdays at KGI (room/medium tbd) from 12:30-4:00pm
  • Also, you are welcome to come to any of the CS5 lectures and/or labs and/or tutoring hours... These can supplement or expand on our meetings at KGI, and they're totally optional. (See HMC's CS5 pages)


Textbooks and Software

 

There are two free, online textbooks that many students have found useful. That said, we do not "plow through" them in any sense. They are

We will not be working through the book(s) chapter-by-chapter. Rather, in our investigations as to "How to Think like a Computer Scientist," we will use the book(s) as appropriate. We may also use other materials from HMC's CS5 and CS35:


Grading

 

Your grade in the course will be based on a combination of your homework and projects. The basic intent is that the homeworks will act as learning tools, while the final project gives you the chance to explore more open-ended challenges.


The above table is taken directly from KGI's policy (with thanks to Dean Fortini for pointing this out!). The overall points are approximately divided into 800 for homework assignments, 100 for in-class exercises and mastery-practice, and 400 to 450 for the final project.
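To get a feel for how those approximate point totals weigh against one another, here is a minimal Python sketch. The constant and function names are hypothetical, chosen only for this illustration; the numbers are the approximate values quoted above, and this is not the official grade calculation.

```python
# Approximate point totals quoted in the paragraph above (not an official formula).
HOMEWORK_POINTS = 800
IN_CLASS_POINTS = 100
PROJECT_POINTS_RANGE = (400, 450)  # the final project is given as a range


def component_shares(project_points):
    """Return each component's fraction of the overall point total."""
    total = HOMEWORK_POINTS + IN_CLASS_POINTS + project_points
    return {
        "homework": HOMEWORK_POINTS / total,
        "in_class": IN_CLASS_POINTS / total,
        "project": project_points / total,
    }


# Show the total and the rounded shares at both ends of the project range.
for project_points in PROJECT_POINTS_RANGE:
    total = HOMEWORK_POINTS + IN_CLASS_POINTS + project_points
    shares = component_shares(project_points)
    print(total, {name: round(share, 3) for name, share in shares.items()})
```

Under these assumptions the overall total is about 1300 to 1350 points, with homework making up roughly 60% of it.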



Homework Assignments

 

There will be weekly homework assignments, due at the following times:

           11:59pm Wednesday evenings
         

These assignments are intended to exercise and solidify your understanding of the week's material, review older material, and look ahead to new topics. They are the most important part of the course!


Getting Help

 

While your work must be your own (or the team's), it is important that you actively seek out help when you are having trouble in the course. Ask questions in the lecture and/or the lab about things in the notes you don't understand. Come to me or consult with your assignment partner with more questions. Don't be afraid to return many times if something doesn't make sense. And an extra pair of eyes is always helpful when debugging.

If your schedule allows it, I would suggest that you work in a team of two or three people: often being able to talk through things with a partner makes the process smoother, more efficient, and more fun!

HMC Labs    Although this doesn't work with everyone's schedule -- and may be affected by distancing requirements/protocols -- there are afternoon/evening labs at HMC (Tuesdays and Wednesdays from 2:30-4:30pm and from 6-8pm) in McGregor's CS Labs. There, you'll have a chance to work on homework with student tutors (we call them "grutors"), as well as either me or another instructor (we always have one instructor available during lab). If those do work and you'd like to join, try it!

The labs are one-and-the-same as tutoring hours. See this page for the link to all of the Zoom URLs.

Claremont Colleges' tutoring hours    In addition, there are many evening (and weekend afternoon) tutoring hours for the parallel-run CS5 course at HMC. Here is the page to access the schedule and details on "grutoring" hours. Feel free to join in for any of those, if they work for your schedule.

In addition, our TA, C. Laurel Anderson, will send out a survey to find the most helpful hours for her to be available. Or, drop her a note!

Piazza questions    One reason we use Piazza is that it's so easy to ask questions - please do! If you're sharing code, please make the questions private-to-instructors; otherwise, feel free to make them public, and everyone is welcome and encouraged to contribute.

Programming and CS are unnatural (alas!)    Too often, a student will bang their head against a wall for hours trying to figure out why a program isn't working, when a few minutes with the professor or another student is enough to make it clear! Don't spend more than 15-20 minutes on a problem if no progress is being made. Consider any of the above strategies and always remember, too, that you can drop me an email!



Course Aims and Objectives

  This course has two central aims, each with a number of associated objectives:

  • Aim 1: To give students the tools to take a computational problem through the process of design, implementation, documentation, and testing.

    Objectives:
    • Break a broad problem down into specific subproblems
    • Write an algorithm to solve a specific problem, and then translate that algorithm into a program in a specific programming language (Python)
    • Write clear, concise documentation: evidence of understanding the problem, solution, or workflow at hand.
    • Develop test cases that reveal programming bugs
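As a small illustration of the Aim 1 objectives, here is a minimal Python sketch. The task (computing the GC content of a DNA string) and the function names `count_gc` and `gc_content` are hypothetical choices for this example, not taken from a course assignment. It breaks the problem into two subproblems, documents each with a docstring, and ends with test cases aimed at likely bugs.

```python
def count_gc(seq):
    """Return the number of G and C bases in seq (case-insensitive)."""
    seq = seq.upper()
    return seq.count("G") + seq.count("C")


def gc_content(seq):
    """Return the fraction of bases in seq that are G or C.

    An empty sequence has no bases, so we return 0.0 rather than
    dividing by zero.
    """
    if len(seq) == 0:
        return 0.0
    return count_gc(seq) / len(seq)


# Test cases chosen to reveal likely bugs:
# empty input, all-GC, no-GC, and lowercase input.
assert gc_content("") == 0.0
assert gc_content("GGCC") == 1.0
assert gc_content("ATAT") == 0.0
assert gc_content("atgc") == 0.5
```

Splitting the counting step from the fraction step keeps each function small enough to test on its own, which is the point of the first objective.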


  • Aim 2: To give students an understanding of the breadth of Computer Science as a discipline and how it exists in the world.

    Objectives:
    • Identify applications of computer science, in one's own discipline and beyond
    • Describe the big questions in computer science, especially as they can support creativity and efficiency in other disciplines
    • Describe the relationship between a number of major sub-disciplines within computer science, especially the ways in which computing can contribute to developing insight in the biological and biomedical sciences.

Academic Integrity, Classroom conduct and inclusion

  Software's value - and challenge

The skill of creating and understanding software from source is a very valuable one. One reason for this value is software's ability to be used and shared widely.

However, using and sharing software is fundamentally different from the skill of creating and understanding it. Because this course is about the skill of creating and understanding software from source, we in HMC CS have in place the following set of academic-integrity guidelines for all of our courses, this one included:

Guidelines and Policy

In short, all solutions and code should be produced by you alone, or by you and a partner for pair-programmed assignments. For pair-programmed assignments, each partner (or member of the pair) must be an equal co-owner of the work. That means that the two of you must be present and working together at the computer for the duration of that problem. It is not acceptable for one person to do some work on the problem when the other is not present and actively participating. (If you can't schedule that kind of interaction, we ask you not to pair-program.)

You may discuss algorithms at a high level with any student that is currently in the class. If you wish to help someone find a bug in their code, it is important that you have already completed and submitted your solution. Moreover, you may not copy solutions from anyone or any place, nor should you collaborate beyond high-level discussions with anyone unless it is your pair programming partner.

Violations of these policies will likely result in failure of the course. In addition, evidence of academic-integrity violations will be reported to the appropriate administrator at your home institution, and handled by the rules and procedures there. (After the course, all submissions are run through MOSS, a program that systematically checks for code-similarity across past submissions and other sources. Violations of academic integrity guidelines incur all of the same processes and repercussions, whether found during or after a course.)

Big-picture

For your sake and the sake of the Claremont-Colleges community, please conduct yourself with the highest level of academic integrity.

If you have any questions about what behavior is acceptable, it is your responsibility to contact your professor.

Classroom conduct and inclusion

For all of this class's students, grutors, staff, and instructors, our goal is to create learning environments that are usable, equitable, inclusive and welcoming. If facets of the instruction or design of this course result in barriers to inclusion - either to an individual or a specific cohort group - let your instructor know as soon as possible. If an alternative communications channel is more appropriate, please feel free to contact the CS department chair, the HMC Dean of Faculty (both of whom are wonderfully accepting people), your own Dean(s), or a faculty member at HMC or your own campus, depending on the path you feel most comfortable with.

Content Overview

  In brief, the material GENE 5110 covers is a superset of the following:

Week 0 Introduction to computation: CS, Python, and information-parsing
Week 1 From data to information: strings, structures, and slicing
Week 2 CS's fundamental building blocks: functions and files
Week 3 Self-similarity as design strategy: recursion
Week 4 Top-down vs. bottom-up problem solving: analysis and synthesis of algorithms
Week 5 Computation's building blocks: files and folders
Week 6 Using the web via Python: gathering, parsing, and composing content
Week 8 Python's primary strength: its libraries, with many supporting the biomedical sciences
Week 9 Python's biomedical and machine-learning ecosystems, Part 2
Week 10 Python's biomedical and machine-learning ecosystems, Part 3
Week 11 Piecing everything together: large-scale problem solving
Week 12 Piecing everything together: final projects
Week 13 Final projects...
Week 14 Final projects...
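To give a flavor of two of the early topics listed above (slicing in Week 1 and recursion in Week 3), here is a minimal Python sketch. The DNA string and the function name `reverse_complement` are hypothetical, chosen for illustration; they are not drawn from the course materials.

```python
# Slicing: pull pieces out of a string by position.
dna = "ATGCGT"
first_codon = dna[0:3]    # the first three bases
reversed_dna = dna[::-1]  # a step of -1 walks the string backwards

# Lookup table pairing each base with its complement.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(seq):
    """Return the reverse complement of seq, computed recursively.

    Self-similarity: the reverse complement of a strand is the reverse
    complement of everything after the first base, followed by the
    complement of that first base.
    """
    if seq == "":  # base case: nothing left to process
        return ""
    return reverse_complement(seq[1:]) + COMPLEMENT[seq[0]]


print(first_codon)                 # ATG
print(reversed_dna)                # TGCGTA
print(reverse_complement("ATGC"))  # GCAT
```

The recursive version is written for clarity rather than speed; it assumes the input contains only uppercase A, C, G, and T.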

Remember that GENE5110 splices HMC's CS5 and CS35 (similar to CGU's IST341).

Which is to say, there are many people around Claremont -- a community -- who have built a background in these skillsets and resources. I'm excited for you to join that group! The whole cohort will be happy to help you make computing part of your professional practice.

Welcome!

Here is the homepage for CS35