Purpose

To report on a personal selection of stand-alone modules which offer stop word lists.

Module names are read from data/module.list.ini, which ships with the distro.
Each module's entry has an indicator, 'include = Yes/No', which makes it easy to edit the selection and re-run the report.
However, because each included module returns its list of words through a different mechanism, the names of the included modules are also hard-coded in the source code.
Excluded modules are listed at the end of this report.
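
The selection mechanism described above can be sketched roughly as follows. This is an illustration in Python, not the report generator's own code. The layout of data/module.list.ini is assumed here (one section per module with an 'include' key), and the names SAMPLE_INI, LOADERS and split_modules are hypothetical, as are the placeholder word lists.

```python
import configparser

# Assumed layout for data/module.list.ini: one section per module, with an
# 'include' flag. The real file may use different keys or sections.
SAMPLE_INI = """
[Lingua::EN::StopWordList]
include = Yes

[Lingua::StopWords]
include = Yes

[Text::DeDuper]
include = No
notes = User may provide a stopword list
"""

# Hard-coded loaders, one per included module, standing in for each module's
# own way of returning its words. The word lists are tiny placeholders.
LOADERS = {
    "Lingua::EN::StopWordList": lambda: ["a", "able", "about"],
    "Lingua::StopWords": lambda: ["a", "about", "above"],
}


def split_modules(ini_text):
    """Return (included, excluded) module names, in file order."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    included, excluded = [], []
    for name in parser.sections():
        # getboolean accepts yes/no in any letter case.
        if parser.getboolean(name, "include", fallback=False):
            included.append(name)
        else:
            excluded.append(name)
    return included, excluded


included, excluded = split_modules(SAMPLE_INI)

# An included module without a hard-coded loader could not be reported on.
missing = [name for name in included if name not in LOADERS]
word_counts = {name: len(LOADERS[name]()) for name in included if name in LOADERS}
```

Flipping an 'include' flag moves a module between the two lists without touching the code, but a newly included module still needs its own loader entry, which is the hard-coding the text refers to.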

Report generator

Module
Version 1.01

Modules included

Name Package Version Word count
1: Lingua::EN::StopWordList Lingua::EN::StopWordList 1.00 659
2: Lingua::EN::StopWords Lingua::EN::Segmenter 0.1 213
3: Lingua::StopWords Lingua::StopWords 0.09 174

Words

Id Lingua::EN::StopWordList Lingua::EN::StopWords Lingua::StopWords
1 a a a
2 a's
3 able
4 about about about
5 above above above
6 abroad
7 according
8 accordingly
9 across across
10 actually
11 adj adj
12 after after after
13 afterwards
14 again again again
15 against against against
16 ago
17 ahead
18 ain't
19 all all all
20 allow
21 allows
22 almost almost
23 alone alone
24 along along
25 alongside
26 already
27 also also
28 although although
29 always always
30 am am am
31 amid
32 amidst
33 among among
34 amongst
35 an an an
36 and and and
37 another another
38 any any any
39 anybody anybody
40 anyhow
41 anyone anyone
42 anything anything
43 anyway
44 anyways
45 anywhere anywhere
46 apart apart
47 appear
48 appreciate
49 appropriate
50 are are are
51 aren't aren't
52 around around
53 as as as
54 aside aside
55 ask
56 asking
57 associated
58 at at at
59 available
60 away away
61 awfully
62 b
63 back
64 backward
65 backwards
66 be be be
67 became
68 because because because
69 become
70 becomes
71 becoming
72 been been been
73 before before before
74 beforehand
75 begin
76 behind behind
77 being being being
78 believe
79 below below below
80 beside
81 besides besides
82 best
83 better
84 between between between
85 beyond beyond
86 both both both
87 brief
88 but but but
89 by by by
90 c
91 c'mon
92 c's
93 came
94 can can
95 can't can't
96 cannot cannot cannot
97 cant
98 caption
99 cause
100 causes
101 certain
102 certainly
103 changes
104 clearly
105 co
106 co.
107 com
108 come
109 comes
110 concerning
111 consequently
112 consider
113 considering
114 contain
115 containing
116 contains
117 corresponding
118 could could could
119 couldn't couldn't
120 course
121 currently
122 d
123 dare
124 daren't
125 deep
126 definitely
127 described
128 despite
129 did did did
130 didn't didn't
131 different
132 directly
133 do do do
134 does does does
135 doesn't doesn't
136 doing doing doing
137 don't don't
138 done done
139 down down down
140 downwards downwards
141 during during during
142 e
143 each each each
144 edu
145 eg
146 eight
147 eighty
148 either either
149 else else
150 elsewhere
151 end
152 ending
153 enough enough
154 entirely
155 especially
156 et
157 etc etc
158 even even
159 ever ever
160 evermore
161 every every
162 everybody everybody
163 everyone everyone
164 everything
165 everywhere
166 ex
167 exactly
168 example
169 except except
170 f
171 fairly
172 far far
173 farther
174 few few few
175 fewer
176 fifth
177 first
178 five
179 followed
180 following
181 follows
182 for for for
183 forever
184 former
185 formerly
186 forth forth
187 forward
188 found
189 four
190 from from from
191 further further
192 furthermore
193 g
194 get get
195 gets gets
196 getting
197 given
198 gives
199 go
200 goes
201 going
202 gone
203 got got
204 gotten
205 greetings
206 h
207 had had had
208 hadn't hadn't
209 half
210 happens
211 hardly hardly
212 has has has
213 hasn't hasn't
214 have have have
215 haven't haven't
216 having having having
217 he he
218 he'd he'd
219 he'll he'll
220 he's he's
221 hello
222 help
223 hence
224 her her her
225 here here here
226 here's here's
227 hereafter
228 hereby
229 herein
230 hereupon
231 hers hers
232 herself herself herself
233 hi
234 him him him
235 himself himself himself
236 his his his
237 hither
238 hopefully
239 how how how
240 how's
241 howbeit
242 however however
243 hundred
244 i i i
245 i'd i'd
246 i'll i'll
247 i'm i'm
248 i've i've
249 ie
250 if if if
251 ignored
252 immediate
253 in in in
254 inasmuch
255 inc
256 inc.
257 indeed indeed
258 indicate
259 indicated
260 indicates
261 inner
262 inside
263 insofar
264 instead instead
265 into into into
266 inward inward
267 is is is
268 isn't isn't
269 it it it
270 it'd
271 it'll
272 it's it's
273 its its its
274 itself itself itself
275 j
276 just just
277 k
278 keep
279 keeps
280 kept kept
281 know
282 known
283 knows
284 l
285 last
286 lately
287 later
288 latter
289 latterly
290 least
291 less
292 lest
293 let
294 let's let's
295 like
296 liked
297 likely
298 likewise
299 little
300 look
301 looking
302 looks
303 low
304 lower
305 ltd
306 m
307 made
308 mainly
309 make
310 makes
311 many many
312 may
313 maybe maybe
314 mayn't
315 me me
316 mean
317 meantime
318 meanwhile
319 merely
320 might might
321 mightn't
322 mine mine
323 minus
324 miss
325 more more more
326 moreover
327 most most most
328 mostly mostly
329 mr
330 mrs
331 much much
332 must must
333 mustn't mustn't
334 my my
335 myself myself myself
336 n
337 name
338 namely
339 nd
340 near near
341 nearly
342 necessary
343 need
344 needn't
345 needs
346 neither neither
347 never
348 neverf
349 neverless
350 nevertheless
351 new
352 next next
353 nine
354 ninety
355 no no no
356 no-one
357 nobody nobody
358 non
359 none none
360 nonetheless
361 noone
362 nor nor nor
363 normally
364 not not not
365 nothing nothing
366 notwithstanding
367 novel
368 now
369 nowhere nowhere
370 o
371 obviously
372 of of of
373 off off off
374 often often
375 oh
376 ok
377 okay
378 old
379 on on on
380 once once
381 one
382 one's
383 ones
384 only only only
385 onto onto
386 opposite
387 or or or
388 other other other
389 others others
390 otherwise
391 ought ought ought
392 oughtn't
393 our our our
394 ours ours ours
395 ourselves ourselves
396 out out out
397 outside outside
398 over over over
399 overall
400 own own own
401 p p
402 particular
403 particularly
404 past
405 per per
406 perhaps
407 placed
408 please please
409 plus plus
410 possible
411 pp
412 presumably
413 probably
414 provided
415 provides
416 q
417 que
418 quite quite
419 qv
420 r
421 rather rather
422 rd
423 re
424 really really
425 reasonably
426 recent
427 recently
428 regarding
429 regardless
430 regards
431 relatively
432 respectively
433 right
434 round
435 s
436 said said
437 same same
438 saw
439 say
440 saying
441 says
442 second
443 secondly
444 see
445 seeing
446 seem seem
447 seemed
448 seeming
449 seems
450 seen
451 self self
452 selves selves
453 sensible
454 sent
455 serious
456 seriously
457 seven
458 several several
459 shall shall
460 shan't shan't
461 she she she
462 she'd she'd
463 she'll she'll
464 she's she's
465 should should should
466 shouldn't shouldn't
467 since since
468 six
469 so so so
470 some some some
471 somebody somebody
472 someday
473 somehow
474 someone
475 something
476 sometime
477 sometimes
478 somewhat somewhat
479 somewhere
480 soon
481 sorry
482 specified
483 specify
484 specifying
485 still still
486 sub
487 such such such
488 sup
489 sure
490 t
491 t's
492 take
493 taken
494 taking
495 tell
496 tends
497 th
498 than than than
499 thank
500 thanks
501 thanx
502 that that that
503 that'll
504 that's that's
505 that've
506 thats
507 the the the
508 their their their
509 theirs theirs theirs
510 them them them
511 themselves themselves themselves
512 then then then
513 thence
514 there there there
515 there'd
516 there'll
517 there're
518 there's there's
519 there've
520 thereafter
521 thereby
522 therefore therefore
523 therein
524 theres
525 thereupon
526 these these these
527 they they they
528 they'd they'd
529 they'll they'll
530 they're they're
531 they've they've
532 thing
533 things
534 think
535 third
536 thirty
537 this this this
538 thorough thorough
539 thoroughly thoroughly
540 those those those
541 though
542 three
543 through through through
544 throughout
545 thru
546 thus thus
547 till
548 to to to
549 together together
550 too too too
551 took
552 toward toward
553 towards towards
554 tried
555 tries
556 truly
557 try
558 trying
559 twice
560 two
561 u
562 un
563 under under under
564 underneath
565 undoing
566 unfortunately
567 unless
568 unlike
569 unlikely
570 until until until
571 unto
572 up up up
573 upon upon
574 upwards
575 us
576 use
577 used
578 useful
579 uses
580 using
581 usually
582 v v
583 value
584 various
585 versus
586 very very very
587 via
588 viz
589 vs
590 w
591 want
592 wants
593 was was was
594 wasn't wasn't
595 way
596 we we
597 we'd we'd
598 we'll we'll
599 we're we're
600 we've we've
601 welcome
602 well well
603 went
604 were were were
605 weren't weren't
606 what what what
607 what'll
608 what's what's
609 what've
610 whatever whatever
611 when when when
612 when's
613 whence
614 whenever whenever
615 where where where
616 where's where's
617 whereafter
618 whereas
619 whereby
620 wherein
621 whereupon
622 wherever
623 whether whether
624 which which which
625 whichever
626 while while while
627 whilst
628 whither
629 who who who
630 who'd
631 who'll
632 who's who's
633 whoever
634 whole
635 whom whom whom
636 whomever
637 whose whose
638 why why
639 why's
640 will will
641 willing
642 wish
643 with with with
644 within within
645 without without
646 won't won't
647 wonder
648 would would would
649 wouldn't wouldn't
650 x
651 y
652 yes
653 yet yet
654 you you
655 you'd you'd
656 you'll you'll
657 you're you're
658 you've you've
659 young
660 your your your
661 yours yours
662 yourself yourself yourself
663 yourselves yourselves
664 z
665 zero
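
The table has 665 rows although the largest single list has 659 words, which is consistent with each row being one word from the union of the three lists. A minimal sketch of how such a table might be built is shown below, in Python. The function presence_table and the list names list_a, list_b and list_c are hypothetical, the sample words are placeholders, and sorting by code point is an assumption about the ordering.

```python
def presence_table(word_lists):
    """Merge named word lists into rows of (id, cell per list).

    A cell holds the word if that list contains it, else an empty string.
    """
    names = list(word_lists)
    sets = {name: set(words) for name, words in word_lists.items()}
    # One row per distinct word across all lists, sorted by code point.
    all_words = sorted(set().union(*sets.values()))
    rows = []
    for row_id, word in enumerate(all_words, start=1):
        cells = [word if word in sets[name] else "" for name in names]
        rows.append((row_id, *cells))
    return names, rows


# Placeholder lists; they do not claim which real module holds which word.
word_lists = {
    "list_a": ["a", "a's", "able", "about"],
    "list_b": ["a", "about"],
    "list_c": ["a", "about", "above"],
}

names, rows = presence_table(word_lists)
for row in rows:
    print(*row)
```

Keeping an empty cell for a missing word preserves the column position, so a reader can tell which list each word came from.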

Modules excluded

Name Package Notes
1: AI::Categorizer::Document AI::Categorizer Not stand-alone. User may provide a stopword list
2: Blog::Spam::Plugin::stopwords Blog::Spam Not stand-alone. Uses hard-coded path to stopword file
3: Combine::Matcher combine Not stand-alone. User must provide the stopwords in an (apparently) undocumented fashion
4: DBIx::FullTextSearch::StopList DBIx::FullTextSearch Not stand-alone
5: DBIx::TextIndex::StopList::cz DBIx::TextIndex Czech-language stop words
6: Elastic::Manual::Analysis Elastic::Model Not stand-alone. User may provide a stopword list
7: HTML::Index::Store HTML::Index Not stand-alone. User may provide a stopword list
8: Image::WordCloud::StopWords::EN Image::WordCloud Not stand-alone
9: KinoSearch1::Analysis::Stopalizer KinoSearch1 Not stand-alone
10: KinoSearch::Analysis::Stopalizer KinoSearch Not stand-alone
11: Lucy::Analysis::SnowballStopFilter Lucy Not stand-alone. Supports 13 languages
12: Perl::Critic::Policy::Documentation::PodSpelling Perl::Critic Not stand-alone. Uses Pod::Spell
13: Pod::Weaver::Plugin::StopWords Pod::Weaver Not stand-alone. User may provide a stopword list
14: Pod::Wordlist Pod::Spell Not stand-alone. Built-in stopword list is Perl-specific
15: Search::Glimpse::Index Search::Glimpse Not stand-alone. Also, requires a Glimpse server
16: Search::Indexer::Incremental::MD5 Search::Indexer::Incremental::MD5 Not stand-alone. User may provide a stopword list, or use a built-in Perl-specific list
17: Search::Tokenizer Search::Tokenizer Has an option to accept the Lingua::StopWords list
18: Search::Tools::QueryParser Search::Tools Not stand-alone. User may provide a stopword list
19: Test::Spelling Test::Spelling Perl-specific words via Pod::Spell. User may add words
20: Text::DeDuper Text::DeDuper User may provide a stopword list
21: Text::Language::Guess Text::Language::Guess Uses Lingua::StopWords
22: Text::Similarity::Overlaps Text::Similarity Not stand-alone. User must provide a stopword file
23: UMLS::SenseRelate::TargetWord UMLS::SenseRelate Not stand-alone. Has option to disregard an (apparently) undocumented list of stopwords
24: WAIT::Filter WAIT Apparently contains a built-in list of freeWAIS-sf stopwords
25: WordNet-Similarity WordNet-Similarity Not stand-alone. User may provide a stopword file
Modules are excluded if they are not stand-alone,
or if they require the user to supply the stopword list.
Lastly, modules are excluded if they use one of the other modules listed in this report.

Environment

Author
Date 2012-08-20
OS Debian V 6.0.4
Perl 5.14.2