Commit 34d1648

init web
0 parents  commit 34d1648

97 files changed: +81388 -0 lines changed

.gitignore

Lines changed: 2 additions & 0 deletions
@@ -0,0 +1,2 @@
.DS_Store
.idea

LICENSE

Lines changed: 21 additions & 0 deletions
@@ -0,0 +1,21 @@
MIT License

Copyright (c) 2024 Atharva Sehgal

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

README.md

Lines changed: 51 additions & 0 deletions
@@ -0,0 +1,51 @@
# escher-web

This repository contains the source code for the [website for escher](https://github.com/trishullab/escher), hosted at [https://trishullab.github.io/escher-web/](https://trishullab.github.io/escher-web/).


This is the general workflow I follow to convert a slide deck into a scrollytelling website:

1. Export the slide deck as a PDF file to `static/escher-slides.pdf`.
2. Run `brew install pdf2svg`, then run `static/extract-slides.sh`, which defines the logic for extracting each slide into a folder of frames.

3. Open `index.html` and edit the content to match the slide deck. Here is how a directory of frames integrates into a scrollytelling section:
```html
<section class="section">
  <div class="container">
    <h2 class="title is-2">Heading</h2>
    <!-- ID helps scrollama identify which section to update -->
    <div class="columns is-centered" id="pysr">
      <div class="column is-max-mobile is-max-tablet is-max-desktop is-max-widescreen article">
        <h3 class="title is-size-6-mobile is-size-4-tablet">Sketch of PySR's search space</h3>
        <div class="content is-size-7-mobile is-size-6-tablet has-text-left step">
          ...
        </div>
        <!-- More sections like this for each image. -->
      </div>
      <!-- Image. -->
      <div class="column content">
        <!-- Change to point to the correct folder. -->
        <img src="static/pysr-frames/1.svg" id="updateableFigure" loading="eager">
      </div>
    </div>
  </div>
</section>

<!-- More sections like this for each folder of frames. -->
<!-- At the end -->
<script>
  // Use mobile layout.
  mobileCorrections();
  // Init scrollable sections.
  init("#scientific-discovery");
  // This is the ID of the section we just defined.
  init("#pysr");
  init("#lasr-learning-loop");
  init("#lasr-results");
</script>
```

For similar examples, check out the source code of the COSMOS and LaSR websites, available here:
- [https://trishullab.github.io/cosmos-web/](https://trishullab.github.io/cosmos-web/)
- [https://trishullab.github.io/lasr-web/](https://trishullab.github.io/lasr-web/)
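
The `init(...)` helper called in the snippet above is not shown in this README (it presumably lives in `static/js/scrollytelling.js`). As a rough sketch of what such a helper might do, the code below assumes the global `scrollama` factory from Scrollama.js is loaded and swaps a section's image to the frame matching the active step. The names `frameSrc` and `initSection`, the `img` lookup, and the `0.5` offset are illustrative assumptions, not the repository's actual implementation.

```javascript
// Hypothetical helper: path of the frame shown for a given step.
// Scrollama step indices are 0-based; the exported frames start at 1.svg.
function frameSrc(folder, stepIndex) {
  return `${folder}/${stepIndex + 1}.svg`;
}

// Hypothetical stand-in for init(): wires up one scrollytelling section.
// Browser-only: assumes `scrollama` (from scrollama.js) is a global and that
// the section contains `.step` elements plus a single <img> to update.
function initSection(sectionSelector, folder) {
  const section = document.querySelector(sectionSelector);
  const figure = section.querySelector("img");
  const scroller = scrollama();

  scroller
    .setup({ step: `${sectionSelector} .step`, offset: 0.5 })
    .onStepEnter(({ index }) => {
      // Show the frame that corresponds to the step that just became active.
      figure.src = frameSrc(folder, index);
    });

  // Recompute step positions when the viewport size changes.
  window.addEventListener("resize", () => scroller.resize());
  return scroller;
}
```

Under these assumptions, the `pysr` section from the snippet would be wired up with `initSection("#pysr", "static/pysr-frames")`, and entering the third `.step` element would load `static/pysr-frames/3.svg`.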

index.html

Lines changed: 289 additions & 0 deletions
@@ -0,0 +1,289 @@
<!DOCTYPE html>
<html>

<head>
  <meta charset="utf-8">
  <meta name="description" content="🏎️: Evaluating Agentic Superoptimization on Large Codebases">
  <meta name="keywords"
    content="FormulaCode, Visual Programming, Computer Vision, Context bottleneck Models, Scientific Discovery, Neurosymbolic Learning, Program Synthesis">
  <meta name="viewport" content="width=device-width, initial-scale=1">
  <title>FormulaCode: Evaluating Agentic Superoptimization on Large Codebases</title>

  <script>
    window.dataLayer = window.dataLayer || [];

    function gtag() {
      dataLayer.push(arguments);
    }

    gtag('js', new Date());
    gtag('config', 'G-PYVRSFMDRL');
  </script>

  <link href="https://fonts.googleapis.com/css?family=Google+Sans|Noto+Sans|Castoro" rel="stylesheet">

  <link rel="stylesheet" href="./static/css/bulma.min.css">
  <link rel="stylesheet" href="./static/css/bulma-carousel.min.css">
  <link rel="stylesheet" href="./static/css/bulma-slider.min.css">
  <link rel="stylesheet" href="./static/css/fontawesome.all.min.css">
  <link rel="stylesheet" href="https://cdn.jsdelivr.net/gh/jpswalsh/academicons@1/css/academicons.min.css">
  <link rel="stylesheet" href="./static/css/index.css">
  <link rel="stylesheet" href="./static/css/scrollytelling.css">
  <link rel="icon" href="https://fav.farm/🌀" type="image/x-icon">

  <script src="https://ajax.googleapis.com/ajax/libs/jquery/3.5.1/jquery.min.js"></script>
  <script src="https://polyfill.io/v3/polyfill.min.js?features=es6"></script>
  <script id="MathJax-script" async src="https://cdn.jsdelivr.net/npm/mathjax@3/es5/tex-mml-chtml.js"></script>

  <script defer src="./static/js/fontawesome.all.min.js"></script>
  <script src="./static/js/bulma-carousel.min.js"></script>
  <script src="./static/js/bulma-slider.min.js"></script>
  <script src="./static/js/index.js"></script>
</head>

<body>

  <section class="hero">
    <div class="hero-body">
      <div class="container is-max-desktop">
        <div class="columns is-centered">
          <div class="column has-text-centered">
            <h1 class="title is-1 publication-title"><span class="formulacode">FormulaCode</span>:
              Evaluating Agentic Superoptimization on Large Codebases</h1>
            <div class="is-size-5 publication-authors">
              <span class="author-block">
                <a href="https://atharvas.net">Atharva Sehgal</a><sup>1*</sup>,</span>
              <span class="author-block">
                <a href="https://www.linkedin.com/in/jamesahou/">James Hou</a><sup>3*</sup>,</span>
              <span class="author-block">
                <a href="https://www.cs.utexas.edu/~swarat">Swarat Chaudhuri</a><sup>1</sup>,</span>
              <span class="author-block">
                <a href="https://www.jenjsun.com/">Jennifer Sun</a><sup>2</sup>,</span>
              <span class="author-block">
                <a href="https://www.cms.caltech.edu/people/yyue/">Yisong Yue</a><sup>3</sup></span>
            </div>
            <div class="is-size-5 publication-authors">
              <span class="author-block"><sup>1</sup>UT Austin,</span>
              <span class="author-block"><sup>2</sup>Cornell,</span>
              <span class="author-block"><sup>3</sup>Caltech</span>
              <span class="author-block"><sup>*</sup>Equal Contribution</span>
            </div>

            <div class="column has-text-centered">
              <div class="publication-links">
                <span class="link-block">
                  <a href="static/paper.pdf"
                    class="external-link button is-normal is-rounded is-dark">
                    <span class="icon">
                      <i class="fas fa-file-pdf"></i>
                    </span>
                    <span>Paper</span>
                  </a>
                </span>
                <!-- Code Link. -->
                <span class="link-block">
                  <a href="https://github.com/formula-code"
                    class="external-link button is-normal is-rounded is-dark">
                    <span class="icon">
                      <i class="fab fa-github"></i>
                    </span>
                    <span>Code</span>
                  </a>
                </span>
                <!-- <span class="link-block">
                  <a href="https://example.com"
                    class="external-link button is-normal is-rounded is-dark">
                    <span class="icon">
                      <i class="fas fa-external-link-alt"></i>
                    </span>
                    <span>Short Slide Deck</span>
                  </a>
                </span> -->
                <span class="link-block">
                  <a href="./static/icmlpral-poster.pdf"
                    class="external-link button is-normal is-rounded is-dark">
                    <span class="icon">
                      <i class="fas fa-external-link-alt"></i>
                    </span>
                    <span>ICML-PRAL Poster</span>
                  </a>
                </span>
              </div>

            </div>
          </div>
        </div>
      </div>
    </div>
  </section>

  <section class="hero teaser">
    <div class="container is-max-desktop">
      <div class="hero-body">
        <img src="./static/images/teaser.svg" style="max-width: 100%; height: auto;" loading="eager">
        <div class="subtitle has-text-centered is-size-6">
          Test cases streamline performance evaluation but constrain coding agents (e.g., <a
            href="https://deepmind.google/discover/blog/alphaevolve-a-gemini-powered-coding-agent-for-designing-advanced-algorithms/">AlphaEvolve</a>)
          to a pass/fail reward, a signal too sparse to foster iterative optimization. <span
            class="formulacode">FormulaCode</span> introduces a live,
          repository-level benchmark that complements existing work (in gray, e.g., <a
            href="https://www.swebench.com/">SWE-Bench</a>) by challenging agents to
          optimize 451 real-world performance bottlenecks against human solutions drawn from
          community-maintained benchmarks (in light blue). These benchmarks provide evaluation
          functions that capture fine-grained performance insights, are less susceptible to data
          leakage, and expose a larger optimization surface to coding agents.
        </div>

      </div>
    </div>
  </section>


  <section class="section">
    <div class="container is-max-desktop">
      <!-- Abstract. -->
      <div class="columns is-centered has-text-centered">
        <div class="column is-four-fifths">
          <h2 class="title is-3">Abstract</h2>
          <div class="content has-text-justified">
            <p>
              Rapid advances in LLM agents have demonstrated the ability to optimize code against
              continuous objective functions, a significant leap beyond traditional code generation
              techniques. However, there is an urgent need for novel benchmarks that can effectively
              measure this capability and translate it into real-world impact. Current code benchmarks,
              which often rely on binary pass/fail outcomes, offer a limited evaluation framework that
              falls short of capturing the full potential of these emerging capabilities.
            </p>
            <p>
              To bridge this gap, we introduce <span class="formulacode">FormulaCode</span>, a novel
              benchmark designed for evaluating agentic superoptimization on large codebases, with a focus
              on real-world performance optimization. Constructed from a dataset of 451 real-world
              performance bottlenecks automatically mined from GitHub, FormulaCode enables comprehensive
              testing of an agent's ability to triage, diagnose, and resolve inefficiencies in realistic
              software environments.
            </p>
            <p>
              FormulaCode proves to be a challenging benchmark for frontier LLMs and agentic frameworks,
              with unrestricted repository exploration emerging as a principal component for finding
              performance inefficiencies. By introducing FormulaCode, our goal is to drive the development
              of next-generation optimization algorithms that meet the rigorous demands of real-world
              software projects.
            </p>
          </div>
        </div>
      </div>
      <!--/ Abstract. -->
    </div>
  </section>


  <section class="section">
    <div class="container is-max-desktop">
      <div class="columns is-centered has-text-centered">
        <div class="column is-four-fifths">
          <h2 class="title is-3">⚠️ Work in progress. Check back in a few days for updates!</h2>
          <div class="content has-text-justified">
          </div>
        </div>
      </div>
    </div>
  </section>

  <section class="section">
    <div class="container is-max-desktop">
      <div class="columns is-centered">
        <div class="column is-full-width">
          <h2 class="title is-3">Related Links</h2>

          <div class="content has-text-left">
            <p>
              This project would not be possible without the excellent work of the community. These
              papers are useful background for understanding the premise of our work:
            </p>
            <ul>
              <li><a href="https://arxiv.org/abs/2310.06770">SWE-bench: Can Language Models Resolve Real-World GitHub Issues?</a></li>
              <li><a href="https://arxiv.org/abs/2401.03065">CRUXEval: Code Reasoning, Understanding, and Execution Evaluation</a></li>
              <li><a href="https://deepmind.google/discover/blog/alphaevolve-a-gemini-powered-coding-agent-for-designing-advanced-algorithms/">AlphaEvolve: A Gemini-powered coding agent for designing advanced algorithms</a></li>
              <li><a href="https://arxiv.org/abs/2210.05050">Neurosymbolic Programming for Science</a></li>
            </ul>

          </div>
        </div>
      </div>

    </div>
  </section>


  <section class="section" id="BibTeX">
    <div class="container is-max-desktop content">
      <h2 class="title">BibTeX</h2>
      <p>
        If you found this post interesting, please read <a href="static/paper.pdf">our
          paper</a> for mathematical details and
        experimental results. You can cite our paper as follows:
      </p>
      <pre><code>@misc{sehgal2025selfevolvingvisualconceptlibrary,
  title={Evaluating Agentic Superoptimization on Large Codebases},
  author={Atharva Sehgal and Patrick Yuan and Ziniu Hu and Yisong Yue and Jennifer J. Sun and Swarat Chaudhuri},
  year={2025},
  eprint={????.?????},
  archivePrefix={arXiv},
  primaryClass={cs.CV},
  url={https://arxiv.org/abs/????.?????},
}</code></pre>
    </div>
  </section>

  <footer class="footer">
    <div class="container">
      <div class="content has-text-centered">
        <a class="icon-link" href="static/paper.pdf">
          <i class="fas fa-file-pdf"></i>
        </a>
        <a class="icon-link" href="https://github.com/formula-code" disabled>
          <i class="fab fa-github"></i>
        </a>
      </div>
      <div class="columns is-centered">
        <div class="column is-8">
          <div class="content">
            <p>
              This template is based on the <a href="https://nerfies.github.io/">Nerfies</a> project page.
              The source code is available <a href="https://github.com/nerfies/nerfies.github.io">here</a> and is
              licensed under a <a rel="license" href="http://creativecommons.org/licenses/by-sa/4.0/">Creative
                Commons Attribution-ShareAlike 4.0 International License</a>. I also make heavy use of the
              <a href="https://github.com/russellsamora/scrollama">Scrollama.js</a> package. Please remember
              to cite either the <a href="https://nerfies.github.io/">Nerfies</a> website or
              <a href="https://github.com/trishullab/FormulaCode-web">this website</a> if you use this template!
            </p>
          </div>
        </div>
      </div>
    </div>
  </footer>

  <script src="./static/css/d3.min.js"></script>
  <script src="./static/scrollama.js"></script>
  <script src="./static/js/scrollytelling.js"></script>
  <script>
    // Use mobile layout.
    mobileCorrections();
    // Init scrollable sections (none are enabled yet).
    // init("#scientific-discovery");
    // init("#cbd");
    // init("#FormulaCode-iterations-loop");
    // init("#FormulaCode-results");
  </script>
</body>

</html>

static/13.png

1.3 MB