<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<meta name="description" content="Project ideas for the USI Computer Vision &amp; Pattern Recognition course, Spring 2026.">
<title>Computer Vision &amp; Pattern Recognition | USI</title>
<link rel="icon" href="assets/favicon.svg" type="image/svg+xml">
<link rel="stylesheet" href="styles.css">
</head>
<body>
<nav class="site-nav" aria-label="Main navigation">
<div class="brand">USI Computer Vision 2026</div>
<div class="nav-links">
<a href="#abstract">Course</a>
<a href="#projects">Projects</a>
<a href="https://github.com/Computer-Vision-2026" target="_blank" rel="noopener">GitHub</a>
<a href="https://search.usi.ch/courses/35275471/computer-vision-pattern-recognition" target="_blank" rel="noopener">Official profile</a>
</div>
</nav>
<header class="hero">
<div class="hero-content">
<h1>Computer Vision</h1>
<p class="hero-copy">
A Spring 2026 course connecting the historical foundations of vision with modern deep learning,
transformers, and hands-on project work.
</p>
</div>
</header>
<main>
<section id="abstract">
<div class="section-inner about-grid">
<div class="about-copy">
<div class="section-heading">
<h2>Abstract</h2>
</div>
<p>
Machine learning has profoundly changed computer vision, but the field's current methods build on a long
history of image formation, geometry, perception, recognition, and representation learning. This course
takes a holistic view of the task of vision.
</p>
<p>
Lectures and tutorials are accompanied by bi-weekly quizzes and project work. Assessment combines
the quizzes, the project, and the final exam.
</p>
<div class="topic-list" aria-label="Course topics">
<span class="topic">Foundations of vision</span>
<span class="topic">Deep learning</span>
<span class="topic">Transformers</span>
</div>
</div>
<ul class="detail-list" aria-label="Course details">
<li>
<b>Instructor</b>
<span><a href="https://francisengelmann.github.io/" target="_blank" rel="noopener">Francis Engelmann</a></span>
</li>
<li>
<b>Assistant</b>
<span><a href="https://nihermann.github.io/" target="_blank" rel="noopener">Nicolai Hermann</a></span>
</li>
<li>
<b>Format</b>
<span>Lectures, tutorials, bi-weekly quizzes, project work, and final exam.</span>
</li>
<li>
<b>Bibliography</b>
<span><i>Foundations of Computer Vision</i>, Antonio Torralba, Phillip Isola, William T. Freeman, MIT Press, 2024.</span>
</li>
<li>
<b>Programs</b>
<span>MSc Artificial Intelligence, MSc Computational Science, MSc Informatics, and Faculty of Informatics PhD students.</span>
</li>
</ul>
</div>
</section>
<section class="band" id="projects">
<div class="section-inner">
<div class="section-heading">
<h2>Student Project Ideas</h2>
<p>
Project proposals live here so classmates can quickly scan possible directions, compare ideas, and submit
additions by pull request.
</p>
</div>
<div class="projects-grid">
<!-- Copy this article to add a new project idea. Keep the teaser visual and write a 60-90 word pitch. -->
<article class="project-card">
<div class="teaser" role="img" aria-label="Abstract computer vision teaser with image grid, camera frame, and detected regions.">
<span class="teaser-label">Example project idea</span>
</div>
<div class="project-content">
<p class="project-meta">Scene understanding, geometry, foundation models</p>
<h3>Semantic Change Maps from Everyday Walks</h3>
<p class="project-abstract">
Can a short phone video reveal how a campus route changes over time? This project combines monocular
depth, semantic segmentation, and feature matching to align walks recorded on different days, then
highlights moved objects, blocked paths, or new scene elements. The result would be a small visual demo
that connects 3D reasoning, human attention, and practical scene understanding for urban navigation.
</p>
<label class="project-toggle-label">
<input class="project-toggle" type="checkbox" aria-label="Toggle full project pitch">
<span class="project-toggle-more">Read more</span>
<span class="project-toggle-less">Show less</span>
</label>
</div>
</article>
<article class="project-card">
<div class="teaser" role="img" aria-label="Two hands playing rock-paper-scissors, but one holds a banana instead of a valid sign, illustrating anomaly detection.">
<img src="assets/group_J.png" alt="" style="position:absolute; inset:0; width:100%; height:100%; object-fit:cover; z-index:2;">
<span class="teaser-label" style="z-index:3;">Group J</span>
</div>
<div class="project-content">
<p class="project-meta">Self-supervised video representations, foundation models, anomaly detection</p>
<h3>Probing V-JEPA 2: What Does a Video Model Actually See?</h3>
<p class="project-abstract">
V-JEPA 2 is Meta’s self-supervised video encoder, trained without labels to predict masked
spatio-temporal regions. We want to open up its latent space and understand how it reacts to the
visual world and, more interestingly, to things that don’t belong in it. Starting from a frozen
pretrained encoder, we build an interactive demo that embeds short clips and surfaces structure,
similarity, and drift over time. On top of this, we explore anomaly detection as a concrete
application: can the embedding space flag a banana played instead of scissors in a rock-paper-scissors
game, a boat on a highway, or an abnormal beat in an ECG recording?
</p>
<label class="project-toggle-label">
<input class="project-toggle" type="checkbox" aria-label="Toggle full project pitch">
<span class="project-toggle-more">Read more</span>
<span class="project-toggle-less">Show less</span>
</label>
</div>
</article>
<article class="project-card add-project-card">
<a href="https://github.com/Computer-Vision-2026/Computer-Vision-2026.github.io/edit/main/index.html" target="_blank" rel="noopener">
<span class="add-project-icon" aria-hidden="true">+</span>
<strong>Add Your Own</strong>
<span>Open GitHub, copy the example card, and submit your project pitch as a pull request.</span>
</a>
</article>
</div>
</div>
</section>
</main>
<footer>
<div class="footer-inner">
<span>Computer Vision &amp; Pattern Recognition, USI, Spring 2026</span>
<a href="https://github.com/Computer-Vision-2026" target="_blank" rel="noopener">GitHub organization</a>
<a href="https://search.usi.ch/courses/35275471/computer-vision-pattern-recognition" target="_blank" rel="noopener">Official course profile</a>
</div>
</footer>
</body>
</html>