Merged
2 changes: 1 addition & 1 deletion data/iris.data
@@ -1,4 +1,4 @@
septal_length,septal_width,petal_length,petal_width,species
sepal_length,sepal_width,petal_length,petal_width,species
5.1,3.5,1.4,0.2,Iris-setosa
4.9,3.0,1.4,0.2,Iris-setosa
4.7,3.2,1.3,0.2,Iris-setosa
8 changes: 8 additions & 0 deletions data/orfs.txt
@@ -0,0 +1,8 @@
>orf0
ATGTACGAGACAACCATGCCTACGATTGAGACGAGCGTTGAAGGAAACGAAAGTTAA
>orf1
ATGTACGAGACAACCATGCCTACGATTGAGACGAGCGTTGAAGGAAACGAAAGTTAACAGAGCTTCCCGTAA
>orf2
ATGCCTACGATTGAGACGAGCGTTGAAGGAAACGAAAGTTAA
>orf3
ATGCCTACGATTGAGACGAGCGTTGAAGGAAACGAAAGTTAACAGAGCTTCCCGTAA
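A minimal sketch of how a file laid out like `data/orfs.txt` could be read back, assuming the FASTA-like layout shown above (a header line starting with `>`, followed by one or more sequence lines). The helper name `read_fasta` is hypothetical and not part of this PR.

```python
def read_fasta(path):
    """Return a dict mapping record names to sequences (hypothetical helper)."""
    records = {}
    name = None
    # the with statement closes the file automatically
    with open(path, "r") as infile:
        for line in infile:
            line = line.strip()  # drop the trailing line break
            if not line:
                continue  # skip blank lines
            if line.startswith(">"):
                name = line[1:]  # header line: remember the record name
                records[name] = ""
            elif name is not None:
                records[name] += line  # a sequence may span several lines
    return records
```

Calling `read_fasta("data/orfs.txt")` from the repository root would then give one entry per ORF, keyed by `orf0` to `orf3`.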
202 changes: 50 additions & 152 deletions lessons/lesson_06.ipynb
@@ -37,16 +37,18 @@
"metadata": {},
"source": [
"## Functions\n",
"\n",
"- functions are blocks of code that can be referenced and easily reused\n",
"- can utilize parameters (arguments)\n",
"- argument order matters, but can be circumvented by using specific argument names (kwargs = keyword arguments)\n",
"- argument order matters, but can be circumvented by using specific argument names (`**kwargs` = keyword arguments)\n",
"- runs when called\n",
"- functions handle variables isolated from the outside code (usually, **exceptions apply**)\n",
"- can return results via `return`, which is optional\n",
"- defined by the keyword `def`\n",
"- requires a name (same rules apply as for variables, i.e. no digits at the beginning or no spaces)\n",
"- require a name (same rules apply as for variables, i.e. no digits at the beginning and no spaces)\n",
"- never define a function name that overwrites a built-in function (e.g. `print`, `sum`, `len`)\n",
"- can have defaults (defined after the list of arguments)\n",
"- optionally (but good practise), each function has a doc string describing the function's purpose commonly using triple quotes"
"- optionally (but good practice), each function has a doc string describing the function's purpose commonly using triple quotes\n"
]
},
{
@@ -166,6 +168,7 @@
"metadata": {},
"outputs": [],
"source": [
"# note: type hints are not enforced or checked by Python itself, but often by code editors\n",
"print(new_function([\"a\", \"b\"], \"a\"))"
]
},
@@ -209,37 +212,6 @@
"print(res)"
]
},
{
"cell_type": "markdown",
"id": "6485cfde",
"metadata": {},
"source": [
"## Additional comments on functions\n",
"- there are a lot of build-in functions like `len()` in Python\n",
"- the complete list can be found here: https://docs.python.org/3/library/functions.html\n",
"- **never define a function name that overwrites a build-in function** (it will mess with your code in unpredictable ways)\n",
"- the same is true for all keywords like `def`, `for`, `in` etc. (though the interpreter will just report an error)\n",
"- the full list of keywords can be found here: https://docs.python.org/3/reference/lexical_analysis.html#keywords"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "92355eff",
"metadata": {},
"outputs": [],
"source": [
"# example overwriting a built in function\n",
"print(sum([1, 2, 3]))\n",
"\n",
"\n",
"def sum(a):\n",
" return a\n",
"\n",
"\n",
"print(sum([1, 2, 3]))"
]
},
{
"cell_type": "markdown",
"id": "ec2e75fa",
@@ -267,85 +239,35 @@
"print(multiply(**my_dict))"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "638ba311",
"metadata": {},
"outputs": [],
"source": [
"my_dict = {\"factor1\": 2, \"factor2\": 3, \"factor3\": 4}\n",
"print(multiply(**my_dict))"
]
},
{
"cell_type": "markdown",
"id": "b6b8f745",
"metadata": {},
"source": [
"## Global vs local variables\n",
"- variables defined within a function are local to that function and cannot be accessed outside of it\n",
"- variables defined outside of any function are global and can be accessed from anywhere in the code, including inside functions (but not modified unless declared as global within the function)"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "e839327d",
"id": "f28f70c9",
"metadata": {},
"outputs": [],
"source": [
"def function_l():\n",
" s = \"I love Minneapolis!\"\n",
" print(\"Var s inside local function:\", s, \"\\n\")\n",
"\n",
"\n",
"def function_g():\n",
" global s\n",
" s = \"I love Seattle!\"\n",
" print(\"Var s inside global function:\", s, \"\\n\")\n",
"## Handling files\n",
"\n",
"- files are an important concept to store and access information\n",
"- to handle files, Python can read and write human-readable files (more or less) directly\n",
"- for more complex input files (e.g. compressed or binary data files like bam), external libraries are required\n",
"\n",
"s = \"I love NYC!\"\n",
"print(\"Var s outside any function:\", s, \"\\n\")\n",
"<br>\n",
"\n",
"function_l()\n",
"print(\"Var s outside local function:\", s, \"\\n\")\n",
"- to open and close a file automatically, we use the built-in function `open(filename, mode)` in combination with a `with` statement\n",
"- the `mode` argument defines how the file is handled, commonly used are:\n",
"\n",
"function_g()\n",
"print(\"Var s outside global function:\", s)"
"| mode | meaning | comment |\n",
"| :--: | :-----: | :---------------------------------- |\n",
"| 'r' | read | default |\n",
"| 'w' | write | overwrites files |\n",
"| 'a' | append | adds to the end of an existing file |\n"
]
},
{
"cell_type": "markdown",
"id": "182e2169",
"id": "e25b0de2",
"metadata": {},
"source": [
"> Personal recommendation: avoid global variables, they can make your code very hard to debug and understand"
]
},
{
"cell_type": "markdown",
"id": "f28f70c9",
"metadata": {},
"source": [
"# Handling files\n",
"- files are an important concept to store and access information\n",
"- to handle files, Python can read and write humanly readable files (more or less) directly\n",
"- for more complex input files (i.e. compressed or binary data files like bam), external libraries are required\n",
"\n",
"<br>\n",
"\n",
"- to open a file, we use the built-in function `open(filename,mode)`, which creates an iterator by line over the file\n",
"- to close a file, we use the method `.close()`. This makes sure, that everything has been written to the file. \n",
"\n",
"| mode | meaning | comment |\n",
"| :---: | :---: | :--- |\n",
"| 'r' | read | default |\n",
"| 'w' | write | overwrites files |\n",
"| 'a' | append | adds to the end of an existing file |\n",
"\n",
"- closing a data stream too early (i.e. when writing) will **not** create an error, but will become a problem later on"
"- the basic, all-purpose function for reading files with automatic closing:"
]
},
{
@@ -355,76 +277,48 @@
"metadata": {},
"outputs": [],
"source": [
"infile = open(\"../data/seqs.fas\", \"r\")\n",
"for line in infile:\n",
" print(\n",
" repr(line)\n",
" ) # repr() shows the string as it is stored in memory, including special characters like \\n for newlines\n",
" print(line)\n",
"infile.close()"
"with open(\"../data/seqs.fas\", \"r\") as infile:\n",
" for line in infile:\n",
" print(line)"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "c2b62fc1",
"cell_type": "markdown",
"id": "bce430f3",
"metadata": {},
"outputs": [],
"source": [
"outfile = open(\"../data/new_seqs.fas\", \"w\")\n",
"for e, seq in enumerate([\"aaaa\", \"cccc\", \"gggg\", \"tttt\"]):\n",
" outfile.write(f\">{e}\\n{seq}\\n\")\n",
"outfile.close()"
"- similar for writing files, but with mode 'w' or 'a':"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "1c3f06fb",
"id": "c2b62fc1",
"metadata": {},
"outputs": [],
"source": [
"infile = open(\"../data/new_seqs.fas\", \"r\")\n",
"for line in infile:\n",
" print(\n",
" line.strip()\n",
" ) # removes white spaces and line breaks on the right side, i.e. aaaa\\n -> aaaa\n",
"infile.close()"
"with open(\"../data/new_seqs.fas\", \"w\") as outfile:\n",
" for e, seq in enumerate([\"aaaa\", \"cccc\", \"gggg\", \"tttt\"]):\n",
" outfile.write(f\">{e}\\n{seq}\\n\")"
]
},
{
"cell_type": "markdown",
"id": "543d3fda",
"metadata": {},
"source": [
"- there are alternative semantic ways to access files\n",
"- syntax is `with <your function creating a datastream> as <stream_variable_name>:` \n",
"- this implies a proper closing when leaving this code construct / finishing processing\n",
"- the keyword `as` function as an alias"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "62b4eb0a",
"id": "9274670b",
"metadata": {},
"outputs": [],
"source": [
"with open(\"../data/seqs.fas\", \"a\") as outfile:\n",
" for e, seq in enumerate(\n",
" [\"acgt\", \"tgca\"], start=4\n",
" ): # optional start parameter for enumerate, default is 0\n",
" outfile.write(f\">{e}\\n{seq}\\n\")"
"- as a sanity check, we import the written file again\n",
"- note the `.strip()` method, which removes leading and trailing whitespace, including the line break at the end of each line"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "c2adb46a",
"id": "1c3f06fb",
"metadata": {},
"outputs": [],
"source": [
"with open(\"../data/seqs.fas\", \"r\") as infile:\n",
"with open(\"../data/new_seqs.fas\", \"r\") as infile:\n",
" for line in infile:\n",
" print(line.strip())"
]
@@ -434,9 +328,11 @@
"id": "014e605c",
"metadata": {},
"source": [
"# Reading a whole file\n",
"## Reading a whole file\n",
"\n",
"- it is possible to read a whole file with the method `.read()` from an open file stream without processing it line by line\n",
"- this might not be advisable in most situations, because it just fills up the memory"
"- this might not be advisable for large files, because the entire content is loaded into memory at once\n",
"- note how all line breaks are part of one and the same string\n"
]
},
{
@@ -460,11 +356,12 @@
"# Exercises\n",
"\n",
"## Functions\n",
"\n",
"- write a function that serves as a calculator for the four basic operations (addition, subtraction, multiplication, division)\n",
"- the function should take three arguments: the first number, the second number, and the operation as a string (i.e. \"add\" or \"+\")\n",
"- the function should return the result of the operation\n",
"- the function should handle division by zero gracefully (i.e. return \"undefined\" or something similar)\n",
"- the function should have a doc string explaining its purpose and usage (enclosed by three quotes `'''`)"
"- the function should have a doc string explaining its purpose and usage (enclosed by three quotes `'''`)\n"
]
},
{
@@ -484,7 +381,7 @@
"metadata": {},
"source": [
"- create a second function to test the calculator function with a few examples, including division by zero\n",
"- the test function should print the results of the tests in a readable format"
"- the test function should print the results of the tests in a readable format\n"
]
},
{
@@ -504,7 +401,7 @@
"source": [
"- create a new function that given a nucleotide sequence returns the reverse complement of the sequence, focus on the 4 standard nucleotides (A, T, C, G) and ignore any other characters (e.g. N or -)\n",
"- use dictionaries for the conversion of the nucleotides\n",
"- the function should take a string as input and return a string as output"
"- the function should take a string as input and return a string as output\n"
]
},
{
@@ -545,9 +442,10 @@
"id": "2958315e",
"metadata": {},
"source": [
"## file handling\n",
"- create a function to create a new file and write the sequence from the previous exercise (ORFs) into it\n",
"- read the file and print its content to the console"
"## File handling\n",
"\n",
"- create a function that exports the sequence from the previous exercise (ORFs) into a new file\n",
"- read the file and print its content to the console\n"
]
},
{
@@ -570,7 +468,7 @@
"metadata": {
"celltoolbar": "Slideshow",
"kernelspec": {
"display_name": "Python 3",
"display_name": "default",
"language": "python",
"name": "python3"
},
@@ -584,7 +482,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.14.3"
"version": "3.13.13"
}
},
"nbformat": 4,
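The three file modes from the reworked "Handling files" cell ('r', 'w', 'a') can be sketched in one self-contained example. This is an illustration under assumptions, not code from the notebook: the function name `demo_modes` and the file name `demo.fas` are hypothetical, and a temporary directory is used so that nothing under `data/` is touched.

```python
import os
import tempfile


def demo_modes(path):
    """Write, append, then read back a small file; returns the stripped lines."""
    with open(path, "w") as outfile:  # 'w' creates the file or overwrites it
        outfile.write(">0\naaaa\n")
    with open(path, "a") as outfile:  # 'a' adds to the end of the existing file
        outfile.write(">1\ncccc\n")
    with open(path, "r") as infile:  # 'r' is the default mode
        return [line.strip() for line in infile]


# hypothetical file name inside a temporary directory
with tempfile.TemporaryDirectory() as tmpdir:
    lines = demo_modes(os.path.join(tmpdir, "demo.fas"))

print(lines)  # ['>0', 'aaaa', '>1', 'cccc']
```

Each `with` block closes its file on exit, so the appended record is fully written before the file is opened again for reading.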