mirror of
https://github.com/flame-engine/flame.git
synced 2025-10-29 07:56:53 +08:00
The script has two purposes:

- remove orphan words
- alphabetize the files

I set it up to run on GitHub Actions as a checker, but a --fix option is also available for running locally.

When running it, I noticed that there are A LOT of orphaned words. At first, I thought cSpell might be missing words in our docs that were clearly used, which would be a HUGE issue. I made this PR to validate that: #2735

But after proper investigation using cSpell's trace command, I realized that we import multiple standard dictionaries, "en_US" and "softwareTerms", and they are constantly being updated. The word "cypher" was added just 12 hours ago, for example.

It turns out that ALL of the current orphan words are properly detected in our files, but are now included in the official dictionaries! Which is amazing.

Note that I did have to stop using the GitHub Action to run cSpell. The reason is twofold: (1) I need to install cSpell anyway to run my script and didn't want the action to download it again; and (2) the version on the GitHub Action (even though it is the same 7.3.7 from npm that I have locally) doesn't have the latest updates (for example, it lacks the word "cypher" that was added 12 hours ago). This would make my script and the CI script incompatible.
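The two checks described above can be sketched on throwaway data using only standard tools. This is a minimal illustration under stated assumptions, not the script itself: the file names `dict.txt` and `used.txt` and the function names `is_sorted` and `orphans` are hypothetical, and `used.txt` merely stands in for the word list that cSpell would produce.

```bash
#!/bin/bash
# Sketch of the two dictionary checks on toy data.
# All file and function names here are hypothetical.

work_dir=$(mktemp -d)
dict="$work_dir/dict.txt"   # stand-in for a custom cSpell dictionary
used="$work_dir/used.txt"   # stand-in for the words the project actually needs

printf '%s\n' '# Toy dictionary' 'bloc' 'Aseprite' 'cypher # now upstream' > "$dict"
printf '%s\n' 'aseprite' 'bloc' > "$used"

# Succeeds only if the non-comment lines are in case-insensitive order.
is_sorted() {
  grep -v '^#' "$1" | sort -f -C
}

# Prints every dictionary word (comments stripped, lowercased)
# that does not appear in the list of used words.
orphans() {
  grep -v '^#' "$1" \
    | sed -e 's/#.*$//' -e 's/[[:space:]]*$//' \
    | tr '[:upper:]' '[:lower:]' \
    | grep -vxF -f "$2"
}

if is_sorted "$dict"; then sorted=yes; else sorted=no; fi
orphan_words=$(orphans "$dict" "$used")

echo "sorted: $sorted"
echo "orphans: $orphan_words"

rm -r "$work_dir"
```

On this toy input the sketch prints `sorted: no` (because `bloc` precedes `Aseprite`) and `orphans: cypher` (the only word missing from the used list).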
76 lines
2.0 KiB
Bash
Executable File
#!/bin/bash

# true when --fix appears among the arguments; violations are then fixed as well as reported
fix=$([[ "$*" == *--fix* ]] && echo true || echo false)

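# Checks that stdin is sorted case-insensitively. On failure, sort -c reports
# the first out-of-order line on stderr (sort -C would stay silent).
function sort_fn() {
  sort --ignore-case -c
}

# Sorts a dictionary file in place, keeping its first (header) line on top.
function sort_dictionary() {
  local file="$1"
  local tmp_file
  tmp_file=$(mktemp)

  head -n 1 "$file" > "$tmp_file"
  # sort_fn only checks the order and writes nothing to stdout, so sort directly here
  tail -n +2 "$file" | sort --ignore-case >> "$tmp_file"
  mv "$tmp_file" "$file"
}
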
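# Deletes every line of the dictionary that holds the given word, optionally
# followed by whitespace or a trailing # comment (matched case-insensitively).
function delete_unused() {
  local file="$1"
  local word="$2"

  perl -i -ne "print unless /^\s*${word}\s*([# ].*)?$/i" "$file"
}

# Converts stdin to lowercase.
function lowercase() {
  tr 'A-Z' 'a-z'
}
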
word_list="word_list.tmp"

dictionary_dir=".github/.cspell"
tmp_dir=".cspell.tmp"

# Temporarily swap the custom dictionaries for empty files of the same names, so
# that cspell reports every word the custom dictionaries would otherwise cover.
mv "$dictionary_dir" "$tmp_dir"
mkdir "$dictionary_dir"
for file in "$tmp_dir"/*; do
  if [[ -f "$file" ]]; then
    touch "$dictionary_dir/$(basename "$file")"
  fi
done
cspell --dot --no-progress --unique --words-only "**/*.{md,dart}" | lowercase | sort -f > "$word_list" || exit 1
rm -r "$dictionary_dir"
mv "$tmp_dir" "$dictionary_dir"

error=0
for file in .github/.cspell/*.txt; do
  echo "Processing dictionary '$file'..."

  violation=$(awk '!/^#/' "$file" | sort_fn 2>&1 || true)
  if [ -n "$violation" ]; then
    echo "Error: The dictionary '$file' is not in alphabetical order. First violation: '$violation'" >&2
    error=1
    if $fix; then
      echo "Fixing the dictionary '$file'"
      sort_dictionary "$file"
    fi
  fi

  while IFS= read -r line; do
    # split the line by # to remove comments
    word=$(echo "$line" | cut -d '#' -f 1 | xargs | lowercase) # xargs trims whitespace

    # check if the word exists in the project
    if [[ -n "$word" ]] && ! grep -wxF "$word" "$word_list" >/dev/null; then
      echo "Error: The word '$word' in the dictionary '$file' is not needed." >&2
      error=1
      if $fix; then
        echo "Fixing the dictionary '$file' with excess word $word"
        delete_unused "$file" "$word"
      fi
    fi
  done < "$file"
done

rm "$word_list"
exit $error