MPUSP · finstermeier · Apr 30, 2026 · Apr 30, 2026
diff --git a/lessons/lesson_07.ipynb b/lessons/lesson_07.ipynb
@@ -18,7 +18,6 @@
     "  - gzip\n",
     "  - argparse\n",
     "  - math\n",
-    "  - re\n",
     "  - numpy\n",
     "  - pandas\n",
     "- tidy data"
@@ -252,134 +251,6 @@
     "print(\"Gamma:\", math.gamma(3))  # Gamma function at x"
    ]
   },
-  {
-   "cell_type": "markdown",
-   "id": "e44e0357",
-   "metadata": {},
-   "source": [
-    "# Library re\n",
-    "- `re` stands for `regular expression`, aka `regex`\n",
-    "- concept for text pattern matching\n",
-    "- `re.compile()` turns the search pattern into a reusable pattern object, so repeated searches do not have to re-parse the pattern"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "id": "b1616d23",
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "import re\n",
-    "\n",
-    "# This text comes from the book \"20,000 Leagues Under the Sea\" by Jules Verne, published in 1870.\n",
-    "\n",
-    "story = \"\"\"On the 6th of November, 1867, the frigate Abraham Lincoln departed at 3:00 PM from Brooklyn pier.\n",
-    "The crew numbered 307 men and officers.\n",
-    "Captain Farragut had placed a reward of $2,000 for whoever first sighted the creature.\n",
-    "Professor Aronnax, a marine biologist from Paris, stood at the bow scanning the horizon.\n",
-    "The animal, if it exists, must be of considerable size — perhaps 200 feet in length.\n",
-    "The sea was calm; visibility extended roughly 15 nautical miles.\n",
-    "At latitude 31° 15' N, longitude 136° 42' E, they found nothing.\n",
-    "After 3 weeks with no sightings, the crew grew restless.\n",
-    "Then, on November 28th at 11:17 PM, the lookout cried: Object sighted — bearing 315 degrees!\n",
-    "The creature emitted a pale phosphorescent light and moved at approximately 40 knots.\n",
-    "Aronnax estimated its mass at no less than 1,500 tons.\n",
-    "Impossible, said Conseil quietly, and yet — there it is.\"\"\"\n",
-    "\n",
-    "# create pattern\n",
-    "pattern = r\".*\\d{1,2}:\\d{2} PM.*\"\n",
-    "regex = re.compile(pattern)\n",
-    "\n",
-    "# search for pattern and print each line with the pattern in it\n",
-    "print(regex.findall(story))"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "id": "93a3e5a6",
-   "metadata": {},
-   "source": [
-    "- certain strings have specific meanings:\n",
-    "  - `.*`    = any number of any character except `\\n` (including zero), used here before/after our pattern\n",
-    "  - `*`     = zero or more repetitions of the preceding element\n",
-    "  - `\\d{1,2}` = one or two digits\n",
-    "- when compiling the search pattern, you can include certain flags\n",
-    "  - `re.IGNORECASE` to have case insensitive matching\n",
-    "  - `re.DOTALL` to have the `.` match all characters incl. the line end character `\\n`\n",
-    "  - `re.MULTILINE` to handle multiple lines in a string separately, relevant for:\n",
-    "    - `^` = beginning of a string / line\n",
-    "    - `$` = end of a string/line\n",
-    "- to combine multiple flags, use the vertical bar `|`, e.g. `re.DOTALL | re.MULTILINE`"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "id": "5e10a102",
-   "metadata": {},
-   "source": [
-    "- some special characters in search pattern:\n",
-    "\n",
-    "| Character | Meaning |\n",
-    "| :---: | :--- |\n",
-    "| . | any character except new line '\\n' |\n",
-    "| ^ | at the beginning of a string |\n",
-    "| $ | at the end of a string |\n",
-    "| * | multiplier >= 0 |\n",
-    "| + | multiplier >=1 |\n",
-    "| ? | multiplier 0-1 |\n",
-    "| {m} | specific multiplier, e.g. {3} |\n",
-    "| {m,n} | multiplier range, e.g. {2,4}, also {,4} or {4,} for half-open ranges |\n",
-    "| [ ] | character set to choose from, e.g. [ACGT]; special characters become normal characters, e.g. [ab*] |\n",
-    "| [a-z] | a single lower case letter |\n",
-    "| [0-9] | a single digit |\n",
-    "| \\ | escape character, e.g. \\* is an asterisk and not a multiplier |\n",
-    "| \\| | logical or when combining |\n",
-    "  "
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "id": "8964ea64",
-   "metadata": {},
-   "source": [
-    "- several subfunctions are available for a pattern object\n",
-    "- below is an overview of the search functions and their result\n",
-    "- all are methods of a compiled pattern created via `re.compile(<string>)` and take the string to search in; flags are passed to `re.compile()`, not to these methods\n",
-    "\n",
-    "\n",
-    "| Subfunction | Description |\n",
-    "| :--- | :--- |\n",
-    "| `pattern.search(string)` | first match object |\n",
-    "| `pattern.match(string)` | match object, but tests only the beginning of the string |\n",
-    "| `pattern.fullmatch(string)` | match object only if the whole string matches, otherwise returns `None` |\n",
-    "| `pattern.findall(string)` | list of all matching strings |\n",
-    "| `pattern.finditer(string)` | iterator over match objects, similar to list of `.findall()` |\n",
-    "| `pattern.split(string,maxsplit=0)` | splits string at each occurrence of the pattern, limited by maxsplit |\n"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "id": "423d4b35",
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "sequence = \"\"\"ATGGATAAGAAATACTCAATAGGCTTAGATATCGGCACAAATAGCGTCGGATGGGCGGTGATCACTGATG\n",
-    "AATATAAGGTTCCGTCTAAAAAGTTCAAGGTTCTGGGAAATACAGACCGCCACAGTATCAAAAAAAATCT\n",
-    "TATAGGGGCTCTTTTATTTGACAGTGGAGAGACAGCGGAAGCGACTCGTCTCAAACGGACAGCTCGTAGA\"\"\"\n",
-    "pattern = re.compile(\n",
-    "    \"AT[ACT]GG[ACGT]\"\n",
-    ")  # represents AA sequence 'IG' = Isoleucine + Glycine\n",
-    "\n",
-    "print(\"first match\", pattern.search(sequence))\n",
-    "print(\"match beginning\", pattern.match(sequence))\n",
-    "print(\"whole string match\", pattern.fullmatch(sequence))\n",
-    "print(\"list of matches\", pattern.findall(sequence))\n",
-    "print(\"iterator for matches\", pattern.finditer(sequence))\n",
-    "print(\"split at matches\", pattern.split(sequence))"
-   ]
-  },
   {
    "cell_type": "markdown",
    "id": "fad61f15",
@@ -604,7 +475,7 @@
    "metadata": {},
    "source": [
     "## Why to use numpy?\n",
-    "- numpy (and scipy) are fast, really fast\n",
+    "- numpy (and also scipy) is fast, really fast\n",
     "- for demonstration purposes, we will create 10K random numbers and add them together. We will repeat the step for the addition several times and test the performance with a (Jupyter) built-in function `%timeit`\n",
     "- we will compare numpy with a for loop"
    ]
@@ -1127,24 +998,6 @@
    "outputs": [],
    "source": []
   },
-  {
-   "cell_type": "markdown",
-   "id": "1aa2b2d8",
-   "metadata": {},
-   "source": [
-    "- look for any stop codons in your sequence using regular expressions\n",
-    "- stop codons: UAA, UAG, UGA\n",
-    "- print to screen the positions for each stop codon"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "id": "c56a8cf0",
-   "metadata": {},
-   "outputs": [],
-   "source": []
-  },
   {
    "cell_type": "markdown",
    "id": "8c660920",

diff --git a/lessons/lesson_08.ipynb b/lessons/lesson_08.ipynb
@@ -19,7 +19,6 @@
     "  - gzip\n",
     "  - argparse\n",
     "  - math\n",
-    "  - re\n",
     "  - numpy\n",
     "  - pandas\n",
     "- tidy data"

diff --git a/solutions/solutions_07.ipynb b/solutions/solutions_07.ipynb
@@ -35,35 +35,6 @@
     "random_sequence = \"\".join(random.choices(list(\"ACGU\"), k=length))"
    ]
   },
-  {
-   "cell_type": "markdown",
-   "id": "1aa2b2d8",
-   "metadata": {},
-   "source": [
-    "- look for any stop codons in your sequence using regular expressions\n",
-    "- stop codons: UAA, UAG, UGA\n",
-    "- print to screen the positions for each stop codon"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "id": "c56a8cf0",
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "import re\n",
-    "\n",
-    "pattern = re.compile(\"UAA|UAG|UGA\")\n",
-    "\n",
-    "positions = []\n",
-    "\n",
-    "for p in re.finditer(pattern, random_sequence):\n",
-    "    positions.append(p.span())\n",
-    "\n",
-    "print(positions)"
-   ]
-  },
   {
    "cell_type": "markdown",
    "id": "8c660920",