
Commit ad2cc24

Concept: Here Documents (#737)
* WIP Heredoc concept
* add here strings
* add links
* updates to concept readme
* add heredocs to syllabus document
* review suggestions
* review suggestions
* a more active voice
* Adding a real-world example and possible drawbacks. Also replacing the image in the concept readme with a mermaid diagram.
* introduce "herestring" word
* review comments, and populating intro doc
1 parent bcb74ca commit ad2cc24

7 files changed

Lines changed: 683 additions & 9 deletions


concepts/README.md

Lines changed: 27 additions & 9 deletions
Original file line numberDiff line numberDiff line change
@@ -2,9 +2,24 @@
22

33
The [plan](http://forum.exercism.org/t/bash-syllabus-planning/11952)
44

5-
The suggested concept flow:
6-
7-
[![bash syllabus concept flowchart](https://glennj.github.io/img/bash.syllabus.flow.png)](http://forum.exercism.org/t/bash-syllabus-flow/15038)
5+
## Concept Flow:
6+
7+
```mermaid
8+
erDiagram
9+
"Commands and Arguments" ||--|| Variables : ""
10+
Variables ||--|| "The Importance of Quoting" : ""
11+
"The Importance of Quoting" ||--|| Conditionals : ""
12+
"The Importance of Quoting" ||--|| Arrays : ""
13+
"The Importance of Quoting" ||--|| "Pipelines and Command Lists" : ""
14+
Conditionals ||--|| Arithmetic : ""
15+
Conditionals ||--|| Looping : ""
16+
Arrays ||--|| "More About Arrays" : ""
17+
"Pipelines and Command Lists" ||--|| Functions : ""
18+
Functions ||--|| Redirection : ""
19+
Redirection ||..|| "Command Substitution" : TODO
20+
Redirection ||--|| "Here Documents" : ""
21+
"Command Substitution" ||..|| "Process Substitution" : TODO
22+
```
823

924
1. Basic syntax: commands and arguments
1025

@@ -95,23 +110,26 @@ The suggested concept flow:
95110
```
96111
- sublist syntax `${ary[@]:offset:length}`
97112

98-
11. I/O
113+
11. Redirection
99114
- file descriptors, stdin, stdout, stderr
100115
- redirection
116+
117+
12. Here Documents
101118
- here-docs and here-strings
119+
120+
## More Concepts to Add
121+
122+
- I/O continued
102123
- command substitution
103124
- capturing stdout and stderr
104125
- capturing stdout and stderr **into separate variables**
105-
- `exec` and redirections
106126
- process substitutions
107127

108-
## More Concepts
109-
110128
- brace expansions and how it's different from patterns `/path/to/{foo,bar,baz}.txt`
111129
112-
x. option parsing with getopts
130+
- option parsing with getopts
113131
114-
x. `set` command and "strict mode"
132+
- `set` command and "strict mode"
115133
116134
- pros and cons of
117135
- `set -e`
Lines changed: 10 additions & 0 deletions
@@ -0,0 +1,10 @@
1+
{
2+
"authors": [
3+
"glennj"
4+
],
5+
"contributors": [
6+
"IsaacG",
7+
"kotp"
8+
],
9+
"blurb": "Here Documents redirect an inline document to the standard input of a command."
10+
}

concepts/heredocs/about.md

Lines changed: 313 additions & 0 deletions
@@ -0,0 +1,313 @@
1+
# About Here Documents
2+
3+
In Bash scripting, a "here document" (or "heredoc") redirects multiple lines of input to a command or program, as if you were typing them directly into the terminal.
4+
It's a powerful tool for embedding multi-line text within your scripts without needing external files or complex string manipulation.
5+
6+
## Key Features and Syntax
7+
8+
1. Delimiter: a heredoc starts with the `<<` operator followed by a delimiter word (often called the "marker" or "terminator").
9+
This delimiter can be any word you choose, but it's common to use something like `EOF`, `END`, or `TEXT` for clarity.
10+
For more readable code, you can use something descriptive as the delimiter, for example `END_INSTALLATION_INSTRUCTIONS`.
11+
12+
1. Content: after the initial `<< DELIMITER`, you write the content you want to redirect.
13+
This can be multiple lines of text, code, or anything else.
14+
15+
1. Termination: the heredoc ends when the delimiter word appears again on a line by itself, with no leading or trailing whitespace.
16+
17+
## Basic Syntax
18+
19+
```bash
20+
command << DELIMITER
21+
Content line 1
22+
Content line 2
23+
...
24+
Content line N
25+
DELIMITER
26+
```
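The termination rule is strict: only a line that consists of exactly the delimiter word closes the heredoc. The following is a minimal sketch of that behavior; the function name `show_strict_delimiter` is made up for illustration.

```bash
#!/usr/bin/env bash

# Hypothetical helper: prints a heredoc whose body mentions the delimiter word.
show_strict_delimiter() {
cat << END
END of the first sentence, but not of the heredoc.
 END with a leading space is still content.
END
}

show_strict_delimiter
```

Both body lines are printed, because neither of them is the bare word `END` on a line by itself.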
27+
28+
## How it Works
29+
30+
* Bash reads all the lines between the starting `<< DELIMITER` and the ending `DELIMITER`.
31+
* Bash connects this content to the command's standard input.
32+
* The command processes this input as if it were coming from the keyboard.
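
As a rough illustration of the steps above, a heredoc can feed any command that reads standard input, including a `while read` loop. In this sketch the variable names `count` and `last` and the delimiter `END_LINES` are arbitrary choices, not part of any API.

```bash
#!/usr/bin/env bash

count=0
last=""

# The heredoc is attached to the loop as a whole, so each call to read
# consumes the next line of the heredoc content.
while IFS= read -r line; do
    count=$((count + 1))
    last=$line
done << END_LINES
alpha
beta
gamma
END_LINES

echo "Read $count lines; the last one was: $last"
```

Because the loop runs in the current shell, `count` and `last` are still set after the loop finishes.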
33+
34+
### Example 1: Simple Text Output
35+
36+
```bash
37+
cat << EOF
38+
This is the first line.
39+
This is the second line.
40+
This is the third line.
41+
EOF
42+
```
43+
44+
Output:
45+
46+
```plaintext
47+
This is the first line.
48+
This is the second line.
49+
This is the third line.
50+
```
51+
52+
In this example:
53+
54+
* `cat` is the command.
55+
* `<< EOF` starts the heredoc with `EOF` as the delimiter.
56+
* The three lines of text are the content.
57+
* `EOF` on its own line ends the heredoc.
58+
* `cat` then outputs the content it received.
59+
60+
### Example 2: Using with wc (Word Count)
61+
62+
```bash
63+
wc -l << END
64+
Line 1
65+
Line 2
66+
Line 3
67+
END
68+
```
69+
70+
Output:
71+
72+
```plaintext
73+
3
74+
```
75+
76+
Here, `wc -l` counts the number of lines.
77+
The heredoc provides the three lines as input.
78+
79+
### Example 3: Passing data to a script
80+
81+
The script:
82+
83+
```bash
84+
#!/usr/bin/env bash
85+
86+
# Script to process input
87+
while IFS= read -r line; do
88+
echo "Processing: $line"
89+
done
90+
```
91+
92+
Call the script from an interactive bash prompt with a heredoc:
93+
94+
```bash
95+
./your_script << MY_DATA
96+
Item 1
97+
Item 2
98+
Item 3
99+
MY_DATA
100+
```
101+
102+
Output:
103+
104+
```plaintext
105+
Processing: Item 1
106+
Processing: Item 2
107+
Processing: Item 3
108+
```
109+
110+
## Variations and Advanced Features
111+
112+
### Literal Content
113+
114+
Bash performs variable expansion, command substitution, and arithmetic expansion within a heredoc.
115+
In this sense, heredocs act like double quoted strings.
116+
117+
```bash
118+
cat << EOF
119+
The value of HOME is $HOME
120+
The current date is $(date)
121+
Two plus two is $((2 + 2))
122+
EOF
123+
```
124+
125+
Output:
126+
127+
```plaintext
128+
The value of HOME is /home/glennj
129+
The current date is Thu Apr 24 13:47:32 EDT 2025
130+
Two plus two is 4
131+
```
132+
133+
When the delimiter is quoted (using single or double quotes), these expansions are prevented.
134+
The content is taken literally.
135+
This is like single quoted strings.
136+
137+
```bash
138+
cat << 'EOF'
139+
The value of $HOME is not expanded here.
140+
The result of $(date) is not executed.
141+
Two plus two is calculated by $((2 + 2))
142+
EOF
143+
```
144+
145+
Output:
146+
147+
```plaintext
148+
The value of $HOME is not expanded here.
149+
The result of $(date) is not executed.
150+
Two plus two is calculated by $((2 + 2))
151+
```
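
Quoting the delimiter turns off every expansion in the heredoc. If only some of the `$` characters should stay literal, one option is to leave the delimiter unquoted and escape those characters with a backslash, just as in a double quoted string. A small sketch, where `name` and the function `mixed_expansion` are invented for the example:

```bash
#!/usr/bin/env bash

name="world"   # example value, chosen for illustration

# Hypothetical helper so the heredoc output can be printed or captured.
mixed_expansion() {
cat << EOF
Expanded: Hello, $name
Literal:  Hello, \$name
EOF
}

mixed_expansion
```

The first line shows the value of `name`, while the second line shows the text `$name` unchanged.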
152+
153+
### Stripping Leading Tabs
154+
155+
If you use `<<-` (with a trailing hyphen) instead of `<<`, Bash will strip any leading _tab characters_ from each line of the heredoc.
156+
This is useful for indenting the heredoc content within your script without affecting the output.
157+
158+
```bash
159+
# Note, the leading whitespace is tab characters only, not spaces!
160+
# The ending delimiter can have leading tabs as well.
161+
cat <<- END
162+
This line has 1 leading tab.
163+
This line has a leading tab and some spaces.
164+
This line has 2 leading tabs.
165+
END
166+
```
167+
168+
The output is printed with all the leading tabs removed:
169+
170+
```plaintext
171+
This line has 1 leading tab.
172+
This line has a leading tab and some spaces.
173+
This line has 2 leading tabs.
174+
```
175+
176+
~~~~exercism/caution
177+
The author doesn't recommend this usage.
178+
While it can improve the readability of the script,
179+
180+
1. it's easy to accidentally replace the tab characters with spaces (your editor may do this automatically), and
181+
1. it's hard to spot the difference between spaces and tabs.
182+
~~~~
183+
184+
## When to Use Here Documents
185+
186+
* Multi-line input: when you need to provide multiple lines of text to a command.
187+
* Configuration files: embedding small configuration snippets within a script.
188+
* Generating code: creating code on the fly within a script.
189+
* Scripting interactions: simulating user input for interactive programs.
190+
* Avoiding external files: when you want to avoid creating temporary files.
191+
192+
A typical usage might be to provide some help text:
193+
194+
```bash
195+
#!/usr/bin/env bash
196+
197+
usage() {
198+
cat << END_USAGE
199+
Refresh database tables.
200+
201+
usage: ${0##*/} [-h|--help] [-A|--no-archive]
202+
203+
where: --no-archive flag will _skip_ the archive jobs
204+
END_USAGE
205+
}
206+
207+
# ... parsing command line options here ...
208+
209+
if [[ $flag_help == "true" ]]; then
210+
usage
211+
exit 0
212+
fi
213+
```
214+
215+
## Possible Drawbacks
216+
217+
* Large embedded documents can make your code harder to read.
218+
It can be better to deploy your script with documentation in separate files.
219+
* Here documents can break the flow of the code.
220+
You might be in a deeply nested section of code when you want to pass some text to a program.
221+
The heredoc's indentation can look jarring compared to the surrounding code.
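
One possible way to soften the second drawback is to keep the heredoc in a small function defined at the top level and call that function from the nested code. This is only a sketch of the idea; `print_warning` and the loop data are hypothetical.

```bash
#!/usr/bin/env bash

# Hypothetical helper defined at the top level, so the heredoc body
# starts in column one without clashing with deeper indentation.
print_warning() {
    cat << END_WARNING
Warning: $1
Check the input and try again.
END_WARNING
}

for item in good bad; do
    if [[ $item == "bad" ]]; then
        # The nested code stays tidy: the text lives in the function above.
        print_warning "item '$item' was rejected"
    fi
done
```

The trade-off is that the text now sits away from the place where it is used.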
222+
223+
## Here Strings
224+
225+
Like here documents, _here strings_ (or "herestrings") provide input to a command.
226+
However, while heredocs are given as a block of text, herestrings are given as a single string of text.
227+
Here strings use the `<<< "text"` syntax.
228+
229+
```bash
230+
tr 'a-z' 'A-Z' <<< "upper case this string"
231+
```
232+
233+
Output:
234+
235+
```plaintext
236+
UPPER CASE THIS STRING
237+
```
238+
239+
Unlike heredocs, no ending delimiter is required.
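
The word after `<<<` is expanded like any other shell word, so a here string is a convenient way to hand the value of a variable to a command. Bash also appends a newline to the string, which is why `read` sees one complete line. A minimal sketch with made-up sample data:

```bash
#!/usr/bin/env bash

record="42,widget,in stock"   # sample data, invented for this sketch

# Split the record on commas; read takes its input from the here string.
IFS=, read -r -a fields <<< "$record"

echo "number of fields: ${#fields[@]}"
echo "last field: ${fields[2]}"
```

This prints a field count of 3 and the last field `in stock`.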
240+
241+
### Why Use Here Strings?
242+
243+
A pipeline can be used instead of a here string:
244+
245+
```bash
246+
echo "upper case this string" | tr 'a-z' 'A-Z'
247+
```
248+
249+
So why use a here string?
250+
251+
Consider the case where you get the string as output from a long-running computation, and you want to feed the result to two separate commands.
252+
Using pipelines, you have to execute the computation twice:
253+
254+
```bash
255+
some_long_running_calculation | first_command
256+
some_long_running_calculation | second_command
257+
```
258+
259+
A more efficient approach is to capture the output of the computation (using command substitution), and use here strings to provide input to the two subsequent commands:
260+
261+
```bash
262+
result=$( some_long_running_calculation )
263+
first_command <<< "$result"
264+
second_command <<< "$result"
265+
```
266+
267+
Here's a real-world application of that example:
268+
269+
* capture the JSON response to a REST API query (that is paginated),
270+
* provide the JSON data to a jq program to parse the results and output that to a file, and then
271+
* provide the JSON data to another jq program to determine the URL of the next query.
272+
273+
```bash
274+
# initialize the output CSV file
275+
echo "ID,VALUE" > data.csv
276+
277+
url='https://example.com/api/query?page=1'
278+
279+
while true; do
280+
json=$( curl "$url" )
281+
282+
# convert the results part of the response into CSV
283+
jq -r '.results[] | [.id, .value] | @csv' <<< "$json"
284+
285+
# get the URL for the next page
286+
url=$( jq -r '.next_url // ""' <<< "$json" )
287+
if [[ "$url" == "" ]]; then
288+
break
289+
fi
290+
done >> data.csv
291+
```
292+
293+
Note the position of the output redirection.
294+
All output from the while loop will be appended to the file `data.csv`.
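
Another reason to consider a here string: in Bash's default configuration, the commands in a pipeline run in subshells, so a variable set by `read` at the end of a pipeline is gone once the pipeline finishes. A here string is a redirection, so `read` runs in the current shell. The variable names below are arbitrary.

```bash
#!/usr/bin/env bash

# With a pipeline, read runs in a subshell (unless the lastpipe option
# is in effect), so the assignment does not survive.
piped=""
echo "from pipeline" | read -r piped
echo "piped: '$piped'"

# With a here string, read runs in the current shell.
direct=""
read -r direct <<< "from here string"
echo "direct: '$direct'"
```

With default settings, `piped` is still empty afterwards, while `direct` holds the text.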
295+
296+
## Heredocs and Herestrings as Redirection
297+
298+
Because these are just forms of redirection, they can be combined with other redirection operations:
299+
300+
```bash
301+
cat << END_OF_TEXT > output.txt
302+
This is my important text.
303+
END_OF_TEXT
304+
305+
awk '...' <<< "$my_var" >> result.csv
306+
```
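
For a version that can be run as-is, here is a sketch that writes to a temporary file instead of `output.txt`. The use of `mktemp` and the `tr` command are choices made for this example only.

```bash
#!/usr/bin/env bash

outfile=$(mktemp)   # temporary file, so the sketch does not clobber anything

# Heredoc as input, > to create (or truncate) the output file.
cat << END_OF_TEXT > "$outfile"
This is my important text.
END_OF_TEXT

# Here string as input, >> to append to the same file.
tr 'a-z' 'A-Z' <<< "appended line" >> "$outfile"

contents=$(< "$outfile")
rm -f "$outfile"
printf '%s\n' "$contents"
```

The file ends up with the heredoc text on the first line and `APPENDED LINE` on the second.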
307+
308+
## In Summary
309+
310+
Here documents (or "heredocs") are a flexible and convenient way to manage multi-line input in Bash scripts.
311+
They simplify the process of embedding text and data directly within your scripts, making them more self-contained and easier to read.
312+
313+
Here strings (or "herestrings") are like here documents, but offer a simpler, more dynamic syntax.
