This report covers a personal selection of stand-alone modules which offer stop word lists.
Module names are read from data/module.list.ini, which is shipped with the distro.
Each module's entry has an indicator, 'include = Yes/No', which makes it easy to edit the file and re-run the report.
However, because each included module has a different mechanism for returning its list of words, the names of the included modules are also hard-coded in the source code.
Excluded modules are listed at the end of this report.
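As a rough illustration of the 'include = Yes/No' mechanism, the sketch below splits module names into included and excluded groups using Python's standard configparser. The section layout, the 'notes' key, and the function name split_modules are assumptions made for this example; the real data/module.list.ini may be organised differently, and the report itself is generated by Perl code, not by this script.

```python
import configparser

# Hypothetical file contents: one section per module, each with an 'include' flag.
# The real data/module.list.ini may use a different layout.
SAMPLE_INI = """
[Lingua::StopWords]
include = Yes

[Text::DeDuper]
include = No
notes = User may provide a stopword list
"""


def split_modules(ini_text):
    """Return (included, excluded) lists of section names, based on 'include'."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    included, excluded = [], []
    for name in parser.sections():
        # A missing flag is treated as 'No' (an assumption of this sketch).
        flag = parser.get(name, "include", fallback="No").strip().lower()
        (included if flag == "yes" else excluded).append(name)
    return included, excluded


included, excluded = split_modules(SAMPLE_INI)
print("Included:", included)
print("Excluded:", excluded)
```

Flipping a module's flag and re-running is then enough to move it between the two groups, although, as noted above, the report's own code also needs to know how to fetch the words from each included module.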
Module | Version | 1.02 |
Name | Package | Version | Word count |
1: Lingua::EN::StopWordList | Lingua::EN::StopWordList | 1.01 | 659 |
2: Lingua::EN::StopWords | Lingua::EN::Segmenter | 0.1 | 213 |
3: Lingua::StopWords | Lingua::StopWords | 0.09 | 174 |
Id | Lingua::EN::StopWordList | Lingua::EN::StopWords | Lingua::StopWords |
1 | a | a | a |
2 | a's | ||
3 | able | ||
4 | about | about | about |
5 | above | above | above |
6 | abroad | ||
7 | according | ||
8 | accordingly | ||
9 | across | across | |
10 | actually | ||
11 | adj | adj | |
12 | after | after | after |
13 | afterwards | ||
14 | again | again | again |
15 | against | against | against |
16 | ago | ||
17 | ahead | ||
18 | ain't | ||
19 | all | all | all |
20 | allow | ||
21 | allows | ||
22 | almost | almost | |
23 | alone | alone | |
24 | along | along | |
25 | alongside | ||
26 | already | ||
27 | also | also | |
28 | although | although | |
29 | always | always | |
30 | am | am | am |
31 | amid | ||
32 | amidst | ||
33 | among | among | |
34 | amongst | ||
35 | an | an | an |
36 | and | and | and |
37 | another | another | |
38 | any | any | any |
39 | anybody | anybody | |
40 | anyhow | ||
41 | anyone | anyone | |
42 | anything | anything | |
43 | anyway | ||
44 | anyways | ||
45 | anywhere | anywhere | |
46 | apart | apart | |
47 | appear | ||
48 | appreciate | ||
49 | appropriate | ||
50 | are | are | are |
51 | aren't | aren't | |
52 | around | around | |
53 | as | as | as |
54 | aside | aside | |
55 | ask | ||
56 | asking | ||
57 | associated | ||
58 | at | at | at |
59 | available | ||
60 | away | away | |
61 | awfully | ||
62 | b | ||
63 | back | ||
64 | backward | ||
65 | backwards | ||
66 | be | be | be |
67 | became | ||
68 | because | because | because |
69 | become | ||
70 | becomes | ||
71 | becoming | ||
72 | been | been | been |
73 | before | before | before |
74 | beforehand | ||
75 | begin | ||
76 | behind | behind | |
77 | being | being | being |
78 | believe | ||
79 | below | below | below |
80 | beside | ||
81 | besides | besides | |
82 | best | ||
83 | better | ||
84 | between | between | between |
85 | beyond | beyond | |
86 | both | both | both |
87 | brief | ||
88 | but | but | but |
89 | by | by | by |
90 | c | ||
91 | c'mon | ||
92 | c's | ||
93 | came | ||
94 | can | can | |
95 | can't | can't | |
96 | cannot | cannot | cannot |
97 | cant | ||
98 | caption | ||
99 | cause | ||
100 | causes | ||
101 | certain | ||
102 | certainly | ||
103 | changes | ||
104 | clearly | ||
105 | co | ||
106 | co. | ||
107 | com | ||
108 | come | ||
109 | comes | ||
110 | concerning | ||
111 | consequently | ||
112 | consider | ||
113 | considering | ||
114 | contain | ||
115 | containing | ||
116 | contains | ||
117 | corresponding | ||
118 | could | could | could |
119 | couldn't | couldn't | |
120 | course | ||
121 | currently | ||
122 | d | ||
123 | dare | ||
124 | daren't | ||
125 | deep | ||
126 | definitely | ||
127 | described | ||
128 | despite | ||
129 | did | did | did |
130 | didn't | didn't | |
131 | different | ||
132 | directly | ||
133 | do | do | do |
134 | does | does | does |
135 | doesn't | doesn't | |
136 | doing | doing | doing |
137 | don't | don't | |
138 | done | done | |
139 | down | down | down |
140 | downwards | downwards | |
141 | during | during | during |
142 | e | ||
143 | each | each | each |
144 | edu | ||
145 | eg | ||
146 | eight | ||
147 | eighty | ||
148 | either | either | |
149 | else | else | |
150 | elsewhere | ||
151 | end | ||
152 | ending | ||
153 | enough | enough | |
154 | entirely | ||
155 | especially | ||
156 | et | ||
157 | etc | etc | |
158 | even | even | |
159 | ever | ever | |
160 | evermore | ||
161 | every | every | |
162 | everybody | everybody | |
163 | everyone | everyone | |
164 | everything | ||
165 | everywhere | ||
166 | ex | ||
167 | exactly | ||
168 | example | ||
169 | except | except | |
170 | f | ||
171 | fairly | ||
172 | far | far | |
173 | farther | ||
174 | few | few | few |
175 | fewer | ||
176 | fifth | ||
177 | first | ||
178 | five | ||
179 | followed | ||
180 | following | ||
181 | follows | ||
182 | for | for | for |
183 | forever | ||
184 | former | ||
185 | formerly | ||
186 | forth | forth | |
187 | forward | ||
188 | found | ||
189 | four | ||
190 | from | from | from |
191 | further | further | |
192 | furthermore | ||
193 | g | ||
194 | get | get | |
195 | gets | gets | |
196 | getting | ||
197 | given | ||
198 | gives | ||
199 | go | ||
200 | goes | ||
201 | going | ||
202 | gone | ||
203 | got | got | |
204 | gotten | ||
205 | greetings | ||
206 | h | ||
207 | had | had | had |
208 | hadn't | hadn't | |
209 | half | ||
210 | happens | ||
211 | hardly | hardly | |
212 | has | has | has |
213 | hasn't | hasn't | |
214 | have | have | have |
215 | haven't | haven't | |
216 | having | having | having |
217 | he | he | |
218 | he'd | he'd | |
219 | he'll | he'll | |
220 | he's | he's | |
221 | hello | ||
222 | help | ||
223 | hence | ||
224 | her | her | her |
225 | here | here | here |
226 | here's | here's | |
227 | hereafter | ||
228 | hereby | ||
229 | herein | ||
230 | hereupon | ||
231 | hers | hers | |
232 | herself | herself | herself |
233 | hi | ||
234 | him | him | him |
235 | himself | himself | himself |
236 | his | his | his |
237 | hither | ||
238 | hopefully | ||
239 | how | how | how |
240 | how's | ||
241 | howbeit | ||
242 | however | however | |
243 | hundred | ||
244 | i | i | i |
245 | i'd | i'd | |
246 | i'll | i'll | |
247 | i'm | i'm | |
248 | i've | i've | |
249 | ie | ||
250 | if | if | if |
251 | ignored | ||
252 | immediate | ||
253 | in | in | in |
254 | inasmuch | ||
255 | inc | ||
256 | inc. | ||
257 | indeed | indeed | |
258 | indicate | ||
259 | indicated | ||
260 | indicates | ||
261 | inner | ||
262 | inside | ||
263 | insofar | ||
264 | instead | instead | |
265 | into | into | into |
266 | inward | inward | |
267 | is | is | is |
268 | isn't | isn't | |
269 | it | it | it |
270 | it'd | ||
271 | it'll | ||
272 | it's | it's | |
273 | its | its | its |
274 | itself | itself | itself |
275 | j | ||
276 | just | just | |
277 | k | ||
278 | keep | ||
279 | keeps | ||
280 | kept | kept | |
281 | know | ||
282 | known | ||
283 | knows | ||
284 | l | ||
285 | last | ||
286 | lately | ||
287 | later | ||
288 | latter | ||
289 | latterly | ||
290 | least | ||
291 | less | ||
292 | lest | ||
293 | let | ||
294 | let's | let's | |
295 | like | ||
296 | liked | ||
297 | likely | ||
298 | likewise | ||
299 | little | ||
300 | look | ||
301 | looking | ||
302 | looks | ||
303 | low | ||
304 | lower | ||
305 | ltd | ||
306 | m | ||
307 | made | ||
308 | mainly | ||
309 | make | ||
310 | makes | ||
311 | many | many | |
312 | may | ||
313 | maybe | maybe | |
314 | mayn't | ||
315 | me | me | |
316 | mean | ||
317 | meantime | ||
318 | meanwhile | ||
319 | merely | ||
320 | might | might | |
321 | mightn't | ||
322 | mine | mine | |
323 | minus | ||
324 | miss | ||
325 | more | more | more |
326 | moreover | ||
327 | most | most | most |
328 | mostly | mostly | |
329 | mr | ||
330 | mrs | ||
331 | much | much | |
332 | must | must | |
333 | mustn't | mustn't | |
334 | my | my | |
335 | myself | myself | myself |
336 | n | ||
337 | name | ||
338 | namely | ||
339 | nd | ||
340 | near | near | |
341 | nearly | ||
342 | necessary | ||
343 | need | ||
344 | needn't | ||
345 | needs | ||
346 | neither | neither | |
347 | never | ||
348 | neverf | ||
349 | neverless | ||
350 | nevertheless | ||
351 | new | ||
352 | next | next | |
353 | nine | ||
354 | ninety | ||
355 | no | no | no |
356 | no-one | ||
357 | nobody | nobody | |
358 | non | ||
359 | none | none | |
360 | nonetheless | ||
361 | noone | ||
362 | nor | nor | nor |
363 | normally | ||
364 | not | not | not |
365 | nothing | nothing | |
366 | notwithstanding | ||
367 | novel | ||
368 | now | ||
369 | nowhere | nowhere | |
370 | o | ||
371 | obviously | ||
372 | of | of | of |
373 | off | off | off |
374 | often | often | |
375 | oh | ||
376 | ok | ||
377 | okay | ||
378 | old | ||
379 | on | on | on |
380 | once | once | |
381 | one | ||
382 | one's | ||
383 | ones | ||
384 | only | only | only |
385 | onto | onto | |
386 | opposite | ||
387 | or | or | or |
388 | other | other | other |
389 | others | others | |
390 | otherwise | ||
391 | ought | ought | ought |
392 | oughtn't | ||
393 | our | our | our |
394 | ours | ours | ours |
395 | ourselves | ourselves | |
396 | out | out | out |
397 | outside | outside | |
398 | over | over | over |
399 | overall | ||
400 | own | own | own |
401 | p | p | |
402 | particular | ||
403 | particularly | ||
404 | past | ||
405 | per | per | |
406 | perhaps | ||
407 | placed | ||
408 | please | please | |
409 | plus | plus | |
410 | possible | ||
411 | pp | ||
412 | presumably | ||
413 | probably | ||
414 | provided | ||
415 | provides | ||
416 | q | ||
417 | que | ||
418 | quite | quite | |
419 | qv | ||
420 | r | ||
421 | rather | rather | |
422 | rd | ||
423 | re | ||
424 | really | really | |
425 | reasonably | ||
426 | recent | ||
427 | recently | ||
428 | regarding | ||
429 | regardless | ||
430 | regards | ||
431 | relatively | ||
432 | respectively | ||
433 | right | ||
434 | round | ||
435 | s | ||
436 | said | said | |
437 | same | same | |
438 | saw | ||
439 | say | ||
440 | saying | ||
441 | says | ||
442 | second | ||
443 | secondly | ||
444 | see | ||
445 | seeing | ||
446 | seem | seem | |
447 | seemed | ||
448 | seeming | ||
449 | seems | ||
450 | seen | ||
451 | self | self | |
452 | selves | selves | |
453 | sensible | ||
454 | sent | ||
455 | serious | ||
456 | seriously | ||
457 | seven | ||
458 | several | several | |
459 | shall | shall | |
460 | shan't | shan't | |
461 | she | she | she |
462 | she'd | she'd | |
463 | she'll | she'll | |
464 | she's | she's | |
465 | should | should | should |
466 | shouldn't | shouldn't | |
467 | since | since | |
468 | six | ||
469 | so | so | so |
470 | some | some | some |
471 | somebody | somebody | |
472 | someday | ||
473 | somehow | ||
474 | someone | ||
475 | something | ||
476 | sometime | ||
477 | sometimes | ||
478 | somewhat | somewhat | |
479 | somewhere | ||
480 | soon | ||
481 | sorry | ||
482 | specified | ||
483 | specify | ||
484 | specifying | ||
485 | still | still | |
486 | sub | ||
487 | such | such | such |
488 | sup | ||
489 | sure | ||
490 | t | ||
491 | t's | ||
492 | take | ||
493 | taken | ||
494 | taking | ||
495 | tell | ||
496 | tends | ||
497 | th | ||
498 | than | than | than |
499 | thank | ||
500 | thanks | ||
501 | thanx | ||
502 | that | that | that |
503 | that'll | ||
504 | that's | that's | |
505 | that've | ||
506 | thats | ||
507 | the | the | the |
508 | their | their | their |
509 | theirs | theirs | theirs |
510 | them | them | them |
511 | themselves | themselves | themselves |
512 | then | then | then |
513 | thence | ||
514 | there | there | there |
515 | there'd | ||
516 | there'll | ||
517 | there're | ||
518 | there's | there's | |
519 | there've | ||
520 | thereafter | ||
521 | thereby | ||
522 | therefore | therefore | |
523 | therein | ||
524 | theres | ||
525 | thereupon | ||
526 | these | these | these |
527 | they | they | they |
528 | they'd | they'd | |
529 | they'll | they'll | |
530 | they're | they're | |
531 | they've | they've | |
532 | thing | ||
533 | things | ||
534 | think | ||
535 | third | ||
536 | thirty | ||
537 | this | this | this |
538 | thorough | thorough | |
539 | thoroughly | thoroughly | |
540 | those | those | those |
541 | though | ||
542 | three | ||
543 | through | through | through |
544 | throughout | ||
545 | thru | ||
546 | thus | thus | |
547 | till | ||
548 | to | to | to |
549 | together | together | |
550 | too | too | too |
551 | took | ||
552 | toward | toward | |
553 | towards | towards | |
554 | tried | ||
555 | tries | ||
556 | truly | ||
557 | try | ||
558 | trying | ||
559 | twice | ||
560 | two | ||
561 | u | ||
562 | un | ||
563 | under | under | under |
564 | underneath | ||
565 | undoing | ||
566 | unfortunately | ||
567 | unless | ||
568 | unlike | ||
569 | unlikely | ||
570 | until | until | until |
571 | unto | ||
572 | up | up | up |
573 | upon | upon | |
574 | upwards | ||
575 | us | ||
576 | use | ||
577 | used | ||
578 | useful | ||
579 | uses | ||
580 | using | ||
581 | usually | ||
582 | v | v | |
583 | value | ||
584 | various | ||
585 | versus | ||
586 | very | very | very |
587 | via | ||
588 | viz | ||
589 | vs | ||
590 | w | ||
591 | want | ||
592 | wants | ||
593 | was | was | was |
594 | wasn't | wasn't | |
595 | way | ||
596 | we | we | |
597 | we'd | we'd | |
598 | we'll | we'll | |
599 | we're | we're | |
600 | we've | we've | |
601 | welcome | ||
602 | well | well | |
603 | went | ||
604 | were | were | were |
605 | weren't | weren't | |
606 | what | what | what |
607 | what'll | ||
608 | what's | what's | |
609 | what've | ||
610 | whatever | whatever | |
611 | when | when | when |
612 | when's | ||
613 | whence | ||
614 | whenever | whenever | |
615 | where | where | where |
616 | where's | where's | |
617 | whereafter | ||
618 | whereas | ||
619 | whereby | ||
620 | wherein | ||
621 | whereupon | ||
622 | wherever | ||
623 | whether | whether | |
624 | which | which | which |
625 | whichever | ||
626 | while | while | while |
627 | whilst | ||
628 | whither | ||
629 | who | who | who |
630 | who'd | ||
631 | who'll | ||
632 | who's | who's | |
633 | whoever | ||
634 | whole | ||
635 | whom | whom | whom |
636 | whomever | ||
637 | whose | whose | |
638 | why | why | |
639 | why's | ||
640 | will | will | |
641 | willing | ||
642 | wish | ||
643 | with | with | with |
644 | within | within | |
645 | without | without | |
646 | won't | won't | |
647 | wonder | ||
648 | would | would | would |
649 | wouldn't | wouldn't | |
650 | x | ||
651 | y | ||
652 | yes | ||
653 | yet | yet | |
654 | you | you | |
655 | you'd | you'd | |
656 | you'll | you'll | |
657 | you're | you're | |
658 | you've | you've | |
659 | young | ||
660 | your | your | your |
661 | yours | yours | |
662 | yourself | yourself | yourself |
663 | yourselves | yourselves | |
664 | z | ||
665 | zero | ||
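A comparison table of this shape can be derived by taking the sorted union of all the word lists and, for each word, filling a module's column only when that module's list contains it. The sketch below shows one possible way to do this; the function name comparison_rows is hypothetical, and the sample lists are a small subset of the words shown above, so the Ids do not match the full table.

```python
def comparison_rows(lists):
    """Build rows of (id, cell, cell, ...) from a dict of module name -> words.

    Columns follow the dict's insertion order; a cell is the word itself if
    the module's list contains it, otherwise an empty string.
    """
    names = list(lists)
    sets = {name: set(words) for name, words in lists.items()}
    all_words = sorted(set().union(*sets.values()))
    rows = []
    for idx, word in enumerate(all_words, start=1):
        cells = [word if word in sets[name] else "" for name in names]
        rows.append((idx, *cells))
    return names, rows


# Small subset of the words above, for illustration only.
sample = {
    "Lingua::EN::StopWordList": ["a", "a's", "able", "about", "across"],
    "Lingua::EN::StopWords": ["a", "about", "across"],
    "Lingua::StopWords": ["a", "about"],
}

names, rows = comparison_rows(sample)
print(" | ".join(["Id", *names]))
for row in rows:
    print(" | ".join(str(cell) for cell in row))
```

Because the rows come from the union of the lists, the number of rows can exceed the word count of any single module.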
Name | Package | Notes |
1: AI::Categorizer::Document | AI::Categorizer | Not stand-alone. User may provide a stopword list |
2: Blog::Spam::Plugin::stopwords | Blog::Spam | Not stand-alone. Uses hard-coded path to stopword file |
3: Combine::Matcher | combine | Not stand-alone. User must provide the stopwords in an (apparently) undocumented fashion |
4: DBIx::FullTextSearch::StopList | DBIx::FullTextSearch | Not stand-alone |
5: DBIx::TextIndex::StopList::cz | DBIx::TextIndex | Czech-language stop words |
6: Elastic::Manual::Analysis | Elastic::Model | Not stand-alone. User may provide a stopword list |
7: HTML::Index::Store | HTML::Index | Not stand-alone. User may provide a stopword list |
8: Image::WordCloud::StopWords::EN | Image::WordCloud | Not stand-alone |
9: KinoSearch1::Analysis::Stopalizer | KinoSearch1 | Not stand-alone |
10: KinoSearch::Analysis::Stopalizer | KinoSearch | Not stand-alone |
11: Lucy::Analysis::SnowballStopFilter | Lucy | Not stand-alone. Supports 13 languages |
12: Perl::Critic::Policy::Documentation::PodSpelling | Perl::Critic | Not stand-alone. Uses Pod::Spell |
13: Pod::Weaver::Plugin::StopWords | Pod::Weaver | Not stand-alone. User may provide a stopword list |
14: Pod::Wordlist | Pod::Spell | Not stand-alone. Built-in stopword list is Perl-specific |
15: Search::Glimpse::Index | Search::Glimpse | Not stand-alone. Also, requires a Glimpse server |
16: Search::Indexer::Incremental::MD5 | Search::Indexer::Incremental::MD5 | Not stand-alone. User may provide a stopword list, or use a built-in Perl-specific list |
17: Search::Tokenizer | Search::Tokenizer | Has an option to accept the Lingua::StopWords list |
18: Search::Tools::QueryParser | Search::Tools | Not stand-alone. User may provide a stopword list |
19: Test::Spelling | Test::Spelling | Perl-specific words via Pod::Spell. User may add words |
20: Text::DeDuper | Text::DeDuper | User may provide a stopword list |
21: Text::Language::Guess | Text::Language::Guess | Uses Lingua::StopWords |
22: Text::Similarity::Overlaps | Text::Similarity | Not stand-alone. User must provide a stopword file |
23: UMLS::SenseRelate::TargetWord | UMLS::SenseRelate | Not stand-alone. Has option to disregard an (apparently) undocumented list of stopwords |
24: WAIT::Filter | WAIT | Apparently contains a built-in list of freeWAIS-sf stopwords |
25: WordNet-Similarity | WordNet-Similarity | Not stand-alone. User may provide a stopword file |
Modules are excluded if they are not stand-alone, or if they require the user to supply the stopword list.
Lastly, modules are excluded if they use one of the other modules listed in this report.
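The exclusion rules can be read as a short chain of checks. The sketch below encodes them in that order; the Candidate class, its field names, and the flag values given to the sample modules are assumptions made for illustration (they are one reading of the Notes column), not data taken from the report's source code.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Candidate:
    """Hypothetical description of a module considered for the report."""
    name: str
    stand_alone: bool
    has_builtin_list: bool
    uses_listed_module: bool = False


def exclusion_reason(c: Candidate) -> Optional[str]:
    """Return why a module is excluded, or None if it qualifies."""
    if not c.stand_alone:
        return "Not stand-alone"
    if not c.has_builtin_list:
        return "User must supply the stopword list"
    if c.uses_listed_module:
        return "Uses another module listed in this report"
    return None


# Flag values here are illustrative readings of the Notes column.
candidates = [
    Candidate("Lingua::StopWords", stand_alone=True, has_builtin_list=True),
    Candidate("Pod::Wordlist", stand_alone=False, has_builtin_list=True),
    Candidate("Text::DeDuper", stand_alone=True, has_builtin_list=False),
    Candidate("Text::Language::Guess", stand_alone=True, has_builtin_list=True,
              uses_listed_module=True),
]

for c in candidates:
    print(c.name, "->", exclusion_reason(c) or "included")
```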
Author | |
Date | 2015-08-16 |
OS | Debian V 8.1 |
Perl | 5.20.2 |