KMeans clustering ANY documents
Read in your files if they’re all separate
| | filenames | text |
---|---|---|
0 | fanfiction-harry-potter/10001898.txt | Prologue: The MissionDisclaimer: All character... |
1 | fanfiction-harry-potter/10004131.txt | BlackDisclaimer: I do not own Harry PotterAuth... |
2 | fanfiction-harry-potter/10004927.txt | Chapter 1"I'm pregnant.""""Mum please say some... |
3 | fanfiction-harry-potter/10007980.txt | Author's Note: Hey, just so you know, this is ... |
4 | fanfiction-harry-potter/10010343.txt | Disclaimer: I do not own Harry Potter and frie... |
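A minimal sketch of how a dataframe like the one above could be built from a folder of separate text files. The temporary folder and the two sample files are stand-ins invented for this example so it runs anywhere; with real data you would point `glob` at your own folder, for instance `fanfiction-harry-potter/*.txt`.

```python
import glob
import os
import tempfile

import pandas as pd

# Hypothetical stand-in for a folder of .txt files (assumption for the demo).
folder = tempfile.mkdtemp()
samples = {"a.txt": "Harry waved his wand.", "b.txt": "Lily smiled at James."}
for name, content in samples.items():
    with open(os.path.join(folder, name), "w", encoding="utf-8") as f:
        f.write(content)

# Collect the file paths, then read each file's full contents.
filenames = sorted(glob.glob(os.path.join(folder, "*.txt")))
texts = []
for path in filenames:
    with open(path, encoding="utf-8") as f:
        texts.append(f.read())

# One row per document: its path and its text.
df = pd.DataFrame({"filenames": filenames, "text": texts})
print(df.shape)
```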
Or read in your CSV with the text column if not
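If everything already lives in one CSV, something like the sketch below should do. The column names `title` and `text` and the inline sample data are assumptions for illustration; with a real file you would pass its path to `pd.read_csv` instead of the `StringIO` object.

```python
import io

import pandas as pd

# Hypothetical CSV contents; in practice use pd.read_csv("your-file.csv").
csv_data = io.StringIO(
    "title,text\n"
    "story one,Harry waved his wand.\n"
    "story two,Lily smiled at James.\n"
)
df = pd.read_csv(csv_data)

# Drop rows with no text so the vectorizer is not handed missing values.
df = df.dropna(subset=["text"]).reset_index(drop=True)
print(df.head())
```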
Vectorize your documents
What are the options when creating a TfidfVectorizer?
Let’s think about:
- ngram_range: Do we just want single words? Or more? (1,2) is one- and two-word phrases, etc.
- max_features: Can it make things faster? 1 and up
- max_df: Should we ignore words that show up too often? 0.0-1.0 for percent, OR an integer for absolute document counts
- min_df: Should we ignore words that show up too little? 0.0-1.0 for percent, OR an integer for absolute document counts
- vocabulary: Only care about certain words
Also… how many documents do we have?
(1874, 2)
CPU times: user 4.2 s, sys: 85.9 ms, total: 4.28 s
Wall time: 4.4 s
| | able | actually | albus | arm | arms | ask | asked | away | bad | bed | ... | won | words | work | world | wouldn | yeah | year | years | yes | young |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.058290 | 0.000000 | 0.000000 | 0.000000 | ... | 0.026641 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.021998 | 0.161977 |
1 | 0.007969 | 0.015810 | 0.021035 | 0.000000 | 0.015247 | 0.008246 | 0.034870 | 0.054080 | 0.000000 | 0.014964 | ... | 0.000000 | 0.068611 | 0.007508 | 0.000000 | 0.021064 | 0.000000 | 0.028493 | 0.036743 | 0.032899 | 0.088822 |
2 | 0.000000 | 0.030736 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.029091 | ... | 0.123930 | 0.000000 | 0.029191 | 0.000000 | 0.027299 | 0.000000 | 0.044313 | 0.000000 | 0.025583 | 0.094186 |
3 | 0.009575 | 0.000000 | 0.012638 | 0.019594 | 0.009161 | 0.039633 | 0.034917 | 0.006498 | 0.009812 | 0.000000 | ... | 0.000000 | 0.009161 | 0.009022 | 0.000000 | 0.000000 | 0.043148 | 0.102715 | 0.066227 | 0.039532 | 0.058217 |
4 | 0.023051 | 0.000000 | 0.000000 | 0.047170 | 0.022052 | 0.000000 | 0.050434 | 0.078219 | 0.023620 | 0.000000 | ... | 0.000000 | 0.022052 | 0.000000 | 0.060497 | 0.000000 | 0.000000 | 0.049454 | 0.053143 | 0.038067 | 0.256938 |
5 rows × 250 columns
…Try it without the TextBlob tokenizer
Cluster your documents
Fitting 2 clusters using a (1874, 250) matrix
CPU times: user 18.5 s, sys: 148 ms, total: 18.7 s
Wall time: 19.3 s
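Roughly what the clustering step might look like, sketched on toy documents. The variable names, `n_init=10` and `random_state=0` are choices made for this example; the notebook's own settings are not shown in the output above.

```python
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer

# Toy documents (assumption for the demo).
docs = [
    "harry ron hermione potions class",
    "harry hermione ron library homework",
    "james lily sirius lake summer",
    "lily james sirius remus letter",
]
vectorizer = TfidfVectorizer()
matrix = vectorizer.fit_transform(docs)

number_of_clusters = 2
# n_init and random_state are set explicitly so reruns give the same result.
km = KMeans(n_clusters=number_of_clusters, n_init=10, random_state=0)

print(f"Fitting {number_of_clusters} clusters using a {matrix.shape} matrix")
km.fit(matrix)

# One cluster label per document, in the same order as docs.
print(km.labels_)
```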
See what they look like
Top terms per cluster:
Cluster 0: james lily said just like
Cluster 1: harry hermione ron draco said
Push the category back to the original dataframe
| | filenames | text | category |
---|---|---|---|
0 | fanfiction-harry-potter/10001898.txt | Prologue: The MissionDisclaimer: All character... | 0 |
1 | fanfiction-harry-potter/10004131.txt | BlackDisclaimer: I do not own Harry PotterAuth... | 0 |
2 | fanfiction-harry-potter/10004927.txt | Chapter 1"I'm pregnant.""""Mum please say some... | 0 |
3 | fanfiction-harry-potter/10007980.txt | Author's Note: Hey, just so you know, this is ... | 1 |
4 | fanfiction-harry-potter/10010343.txt | Disclaimer: I do not own Harry Potter and frie... | 0 |
5 | fanfiction-harry-potter/10017757.txt | Disclaimer: I don't own any character in the H... | 0 |
6 | fanfiction-harry-potter/10018490.txt | DISCLAIMER: I don't own Harry Potter and its c... | 0 |
7 | fanfiction-harry-potter/10018889.txt | Katherine Rose-TylerChapter One: the Introduct... | 0 |
8 | fanfiction-harry-potter/10019142.txt | I am no longer that shy little boy anymore.I w... | 0 |
9 | fanfiction-harry-potter/10019987.txt | Happy New year! *throws confetti*I've really b... | 0 |
10 | fanfiction-harry-potter/10021604.txt | 2014"It's ridiculous." The red-headed boy shoo... | 1 |
11 | fanfiction-harry-potter/10023278.txt | Disclaimer: Did you really think I was J.K. Ro... | 0 |
12 | fanfiction-harry-potter/10023376.txt | This is my first story on fanfic and I'm nervo... | 0 |
13 | fanfiction-harry-potter/10026221.txt | DISCLAIMER: I don't own anything here that loo... | 0 |
14 | fanfiction-harry-potter/10035136.txt | A/N: So, this is my second ongoing story, and ... | 0 |
15 | fanfiction-harry-potter/10036003.txt | Disclaimer: I do not own Harry Potter. Enjoy t... | 0 |
16 | fanfiction-harry-potter/10037071.txt | For my friend, constant cheerleader and talent... | 0 |
17 | fanfiction-harry-potter/10038079.txt | Disclaimer: Harry Potter's not mineA/N:Another... | 0 |
18 | fanfiction-harry-potter/10038493.txt | Lily Potter was quite happy. Her favourite bro... | 0 |
19 | fanfiction-harry-potter/10041730.txt | A/N: This story follows all of canon besides f... | 1 |
20 | fanfiction-harry-potter/10043489.txt | It's 8th year at Hogwarts. Voldemort is dead. ... | 1 |
21 | fanfiction-harry-potter/10043782.txt | Hey everybody this is my first fic and will en... | 0 |
22 | fanfiction-harry-potter/10045762.txt | Prologue: The Puzzle That Wasn't Meant To Be S... | 1 |
23 | fanfiction-harry-potter/10049288.txt | Prologue REWRITTENA/N: This is a rewritten ver... | 0 |
24 | fanfiction-harry-potter/10050162.txt | I stood at the top of the Astronomy Tower thin... | 1 |
25 | fanfiction-harry-potter/10051779.txt | Title: Twin Dragon Heartstring Cores.Author: L... | 1 |
26 | fanfiction-harry-potter/10052973.txt | Chapter 1Walking down the corridor to potions,... | 1 |
27 | fanfiction-harry-potter/10053360.txt | These characters belong to J.K. Rowling, not m... | 0 |
28 | fanfiction-harry-potter/10055985.txt | A/N: Everyone needs their own version of a Sly... | 0 |
29 | fanfiction-harry-potter/10060250.txt | June 20th, 1992This was it. Dumbledore was sur... | 0 |
... | ... | ... | ... |
1844 | fanfiction-harry-potter/9920715.txt | Summary: Gryffindor isn't what Slytherin sees ... | 0 |
1845 | fanfiction-harry-potter/9925672.txt | Okay. Lumox. The title will be explained much ... | 0 |
1846 | fanfiction-harry-potter/9930431.txt | Normal Pov(A/N: Story takes place in Harry Pot... | 1 |
1847 | fanfiction-harry-potter/9933106.txt | (A/N: Rated M for mature content and language.... | 1 |
1848 | fanfiction-harry-potter/9933135.txt | Hey you guys, I'm back!I wasn't planning on wr... | 1 |
1849 | fanfiction-harry-potter/9940060.txt | Truly, Madly, Deeply, Crazily In Love#Chapter ... | 0 |
1850 | fanfiction-harry-potter/9942637.txt | She took a deep breath as she lay in her hospi... | 0 |
1851 | fanfiction-harry-potter/9944070.txt | Chapter 1Beta Read by optimisticrealist72Thoug... | 1 |
1852 | fanfiction-harry-potter/9944110.txt | "Now begin", Dumbledore's voice boomed over th... | 1 |
1853 | fanfiction-harry-potter/9944611.txt | Hermione's POVShe grabbed another handful of r... | 1 |
1854 | fanfiction-harry-potter/9944886.txt | I thought it would be fun to make one of these... | 0 |
1855 | fanfiction-harry-potter/9944944.txt | Harry`s POVHarry Potter, boy who lived, chosen... | 1 |
1856 | fanfiction-harry-potter/9945192.txt | Okay, so I've been wanting to write a George/H... | 0 |
1857 | fanfiction-harry-potter/9946564.txt | A/N: Adopted from unwrittenlegacy. This will h... | 1 |
1858 | fanfiction-harry-potter/9953756.txt | . . . Sorry . . . Introduction . . . Not the F... | 0 |
1859 | fanfiction-harry-potter/9954472.txt | "Why are we doing this again," Sirius asks gro... | 0 |
1860 | fanfiction-harry-potter/9957949.txt | I do not own any of the characters. You may ho... | 1 |
1861 | fanfiction-harry-potter/9964545.txt | Harry Potter, the Boy-Who-Lived, had always be... | 1 |
1862 | fanfiction-harry-potter/9966340.txt | Original Title: Dipping into the Dark SideOrig... | 0 |
1863 | fanfiction-harry-potter/9969981.txt | A/N: So I woke up the other day with this idea... | 0 |
1864 | fanfiction-harry-potter/9973627.txt | Author's Notes: In my own, happy little world,... | 1 |
1865 | fanfiction-harry-potter/9975438.txt | Title: Fighting For TomorrowRating: TSummary: ... | 1 |
1866 | fanfiction-harry-potter/9978757.txt | Bella Luna, beautiful moon. She wondered if he... | 0 |
1867 | fanfiction-harry-potter/9981541.txt | Authors note: Welcome to all the lovely reader... | 1 |
1868 | fanfiction-harry-potter/9981980.txt | AUTHOR'S NOTEHi everybody! This is my first fa... | 0 |
1869 | fanfiction-harry-potter/9984427.txt | I do not own Harry Potter and I only write for... | 1 |
1870 | fanfiction-harry-potter/9985697.txt | Hi everybody!Im so happy you clicked on this s... | 0 |
1871 | fanfiction-harry-potter/9988645.txt | I fell in love with this pairing and never loo... | 0 |
1872 | fanfiction-harry-potter/9992917.txt | Prologue: A Surrey StartEaster Saturday was pl... | 0 |
1873 | fanfiction-harry-potter/9993970.txt | "Okay, who's going first?" someone shouted ove... | 0 |
1874 rows × 3 columns
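Since the matrix rows are in the same order as the dataframe rows, the cluster labels can be assigned straight back as a new column. A sketch on toy data, assuming the column is called `category` as in the table above:

```python
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer

# Toy dataframe (assumption for the demo).
df = pd.DataFrame({
    "text": [
        "harry ron hermione potions class",
        "harry hermione ron library homework",
        "james lily sirius lake summer",
        "lily james sirius remus letter",
    ]
})

matrix = TfidfVectorizer().fit_transform(df["text"])
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(matrix)

# labels_ holds one cluster number per row, in the original row order.
df["category"] = km.labels_
print(df)
```

This only lines up if the dataframe has not been filtered or reordered between vectorizing and assigning.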
Be pleased
| | filenames | text | category |
---|---|---|---|
3 | fanfiction-harry-potter/10007980.txt | Author's Note: Hey, just so you know, this is ... | 1 |
10 | fanfiction-harry-potter/10021604.txt | 2014"It's ridiculous." The red-headed boy shoo... | 1 |
19 | fanfiction-harry-potter/10041730.txt | A/N: This story follows all of canon besides f... | 1 |
20 | fanfiction-harry-potter/10043489.txt | It's 8th year at Hogwarts. Voldemort is dead. ... | 1 |
22 | fanfiction-harry-potter/10045762.txt | Prologue: The Puzzle That Wasn't Meant To Be S... | 1 |
24 | fanfiction-harry-potter/10050162.txt | I stood at the top of the Astronomy Tower thin... | 1 |
25 | fanfiction-harry-potter/10051779.txt | Title: Twin Dragon Heartstring Cores.Author: L... | 1 |
26 | fanfiction-harry-potter/10052973.txt | Chapter 1Walking down the corridor to potions,... | 1 |
30 | fanfiction-harry-potter/10061747.txt | The air whistles and blows through chestnut lo... | 1 |
31 | fanfiction-harry-potter/10061794.txt | Yay! Okay, now to clear this up. Story takes p... | 1 |
34 | fanfiction-harry-potter/10070079.txt | Disclaimer: JK Rowling owns Harry Potter, and ... | 1 |
35 | fanfiction-harry-potter/10073299.txt | Harry Potter And The Unexpected Second LifeAut... | 1 |
39 | fanfiction-harry-potter/10084022.txt | Ginny sat and thought, she didn't do much else... | 1 |
42 | fanfiction-harry-potter/10086764.txt | A/N: This is the second story I've ever writte... | 1 |
44 | fanfiction-harry-potter/10090975.txt | Sunlight poured through her eyelids, a dull re... | 1 |
45 | fanfiction-harry-potter/10092644.txt | It's over. The War. The death. The hurt. Every... | 1 |
51 | fanfiction-harry-potter/10097915.txt | Me: Hello everyone, how are you guys doing? It... | 1 |
53 | fanfiction-harry-potter/10098087.txt | Heyy! I've decided to write a Harry/Ginny fic,... | 1 |
58 | fanfiction-harry-potter/10107723.txt | LLL- Hi guyeseses. New Harry Potter story insp... | 1 |
59 | fanfiction-harry-potter/10108027.txt | Author's notes: Ta-dahhhh! After many years of... | 1 |
60 | fanfiction-harry-potter/10114060.txt | Author's Note: I was suddenly hit with a HP/LV... | 1 |
61 | fanfiction-harry-potter/10114197.txt | A/N Ok, so this is my first FanFic. I think I ... | 1 |
62 | fanfiction-harry-potter/10114274.txt | Disclaimer: Harry Potter and all characters b... | 1 |
67 | fanfiction-harry-potter/10124094.txt | PrologueThe nightmares were back.At first, it ... | 1 |
71 | fanfiction-harry-potter/10127234.txt | I do not own Harry Potter only the idea of thi... | 1 |
72 | fanfiction-harry-potter/10127513.txt | Hey, this is the final multi chapter part to t... | 1 |
75 | fanfiction-harry-potter/10134610.txt | AN: Not JK Rowling wish I was but I'm notI was... | 1 |
77 | fanfiction-harry-potter/10135379.txt | SWITCHA/N: Okay, so this is a new idea that at... | 1 |
82 | fanfiction-harry-potter/10138430.txt | Harry Dursley - Something's Wrong.Lets get the... | 1 |
85 | fanfiction-harry-potter/10142612.txt | A/N this is a Harry Potter fan fic based on th... | 1 |
... | ... | ... | ... |
1805 | fanfiction-harry-potter/9797364.txt | It was happening again. Night after night, she... | 1 |
1806 | fanfiction-harry-potter/9798300.txt | Alright, this is a rewrite of my fic 'I Can't ... | 1 |
1813 | fanfiction-harry-potter/9831689.txt | A/N: New story time! Yay!There are no propheci... | 1 |
1814 | fanfiction-harry-potter/9833380.txt | Sticky SituationA/N: Hollla! New story for y'a... | 1 |
1815 | fanfiction-harry-potter/9835824.txt | Warning this fic:- contains no slash/yaoi what... | 1 |
1818 | fanfiction-harry-potter/9839017.txt | Summary: Harry was left at the mercies of the ... | 1 |
1820 | fanfiction-harry-potter/9844547.txt | A/N: So this is my first story I've ever poste... | 1 |
1821 | fanfiction-harry-potter/9850472.txt | CHAPTER 1- MAKE NEW FRIENDS BUT KEEP THE OLD.A... | 1 |
1825 | fanfiction-harry-potter/9861833.txt | Title: BodyswitchDisclaimer: This story is bas... | 1 |
1826 | fanfiction-harry-potter/9863146.txt | Disclaimer: Harry Potter is owned by JK Rowlin... | 1 |
1828 | fanfiction-harry-potter/9873356.txt | A/N: A few months ago, I read an article on Mu... | 1 |
1832 | fanfiction-harry-potter/9880268.txt | Chapter one: Potions Masters ApprenticeIt was ... | 1 |
1836 | fanfiction-harry-potter/9886437.txt | Author's note: To anyone out there in fanfic w... | 1 |
1837 | fanfiction-harry-potter/9887402.txt | Chapter One: Goin' Back To Hogwarts"Are you su... | 1 |
1838 | fanfiction-harry-potter/9887587.txt | Disclaimer: J.K. Rowling owns em.His FamilyCha... | 1 |
1842 | fanfiction-harry-potter/9912956.txt | All things here belong to the beautiful mind o... | 1 |
1846 | fanfiction-harry-potter/9930431.txt | Normal Pov(A/N: Story takes place in Harry Pot... | 1 |
1847 | fanfiction-harry-potter/9933106.txt | (A/N: Rated M for mature content and language.... | 1 |
1848 | fanfiction-harry-potter/9933135.txt | Hey you guys, I'm back!I wasn't planning on wr... | 1 |
1851 | fanfiction-harry-potter/9944070.txt | Chapter 1Beta Read by optimisticrealist72Thoug... | 1 |
1852 | fanfiction-harry-potter/9944110.txt | "Now begin", Dumbledore's voice boomed over th... | 1 |
1853 | fanfiction-harry-potter/9944611.txt | Hermione's POVShe grabbed another handful of r... | 1 |
1855 | fanfiction-harry-potter/9944944.txt | Harry`s POVHarry Potter, boy who lived, chosen... | 1 |
1857 | fanfiction-harry-potter/9946564.txt | A/N: Adopted from unwrittenlegacy. This will h... | 1 |
1860 | fanfiction-harry-potter/9957949.txt | I do not own any of the characters. You may ho... | 1 |
1861 | fanfiction-harry-potter/9964545.txt | Harry Potter, the Boy-Who-Lived, had always be... | 1 |
1864 | fanfiction-harry-potter/9973627.txt | Author's Notes: In my own, happy little world,... | 1 |
1865 | fanfiction-harry-potter/9975438.txt | Title: Fighting For TomorrowRating: TSummary: ... | 1 |
1867 | fanfiction-harry-potter/9981541.txt | Authors note: Welcome to all the lovely reader... | 1 |
1869 | fanfiction-harry-potter/9984427.txt | I do not own Harry Potter and I only write for... | 1 |
723 rows × 3 columns
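Pulling out a single cluster is an ordinary pandas filter. The sketch below reuses the first five filenames and categories from the table further up; everything else about it is illustrative.

```python
import pandas as pd

# First five rows from the categorized table above (text column omitted).
df = pd.DataFrame({
    "filenames": [
        "fanfiction-harry-potter/10001898.txt",
        "fanfiction-harry-potter/10004131.txt",
        "fanfiction-harry-potter/10004927.txt",
        "fanfiction-harry-potter/10007980.txt",
        "fanfiction-harry-potter/10010343.txt",
    ],
    "category": [0, 0, 0, 1, 0],
})

# Keep only the documents assigned to cluster 1.
cluster_one = df[df["category"] == 1]
print(cluster_one)

# How many documents landed in each cluster?
counts = df["category"].value_counts()
print(counts)
```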
['said',
'thee',
'ye',
'after',
'mostly',
'my',
'whereafter',
'been',
'sincere',
'see',
'con',
'and',
'elsewhere',
'every',
'however',
'others',
'couldnt',
'made',
'over',
'such',
'since',
'became',
'any',
'else',
'below',
'these',
'without',
'against',
'amoungst',
'still',
'whole',
'we',
'all',
'or',
'full',
'am',
'have',
'together',
'across',
'into',
'hence',
'sometimes',
'thereafter',
'must',
'first',
'moreover',
'only',
'alone',
'than',
'empty',
'they',
'yourselves',
'myself',
'few',
'out',
'cry',
'ten',
'done',
'throughout',
'twenty',
'were',
'whither',
'also',
'much',
'here',
'except',
'sometime',
'everything',
'move',
'as',
'this',
'un',
'fire',
'him',
'an',
'nevertheless',
'latter',
'take',
'us',
'etc',
'along',
'some',
'off',
'beside',
'get',
'whence',
'anyone',
'seemed',
'towards',
'further',
'thus',
'back',
'interest',
'never',
'he',
'ltd',
'part',
'yours',
'no',
'seem',
'perhaps',
'should',
'whenever',
'neither',
'under',
'behind',
'therein',
'already',
'do',
'itself',
'nobody',
'hundred',
'her',
'your',
'beyond',
'is',
'who',
'would',
'fill',
'it',
'may',
'their',
'fifteen',
'himself',
'those',
'give',
'hereafter',
'them',
'was',
'enough',
'front',
'mill',
'due',
'most',
'of',
'show',
'please',
'nothing',
'top',
'third',
'sixty',
'go',
'twelve',
'anywhere',
'other',
'so',
'herein',
'indeed',
'none',
'side',
'because',
'bottom',
'thick',
'thin',
'put',
'the',
'not',
'from',
'per',
'seems',
'three',
'somewhere',
'again',
'thru',
'within',
'whereupon',
'once',
'bill',
'whereby',
'beforehand',
'around',
'although',
'if',
'often',
'upon',
'nor',
'many',
'to',
'she',
'up',
'least',
'hers',
'that',
'while',
're',
'you',
'hereby',
'ever',
'same',
'whom',
'anyway',
'yourself',
'whatever',
'becoming',
'down',
'eight',
'can',
'his',
'one',
'about',
'former',
'four',
'ourselves',
'hasnt',
'via',
'which',
'be',
'co',
'thence',
'either',
'where',
'serious',
'always',
'too',
'toward',
'becomes',
'someone',
'call',
'themselves',
'what',
'even',
'everyone',
'keep',
'name',
'thereupon',
'wherein',
'formerly',
'through',
'somehow',
'afterwards',
'almost',
'eleven',
'two',
'yet',
'could',
'on',
'has',
'i',
'whether',
'meanwhile',
'nowhere',
'found',
'but',
'describe',
'wherever',
'fifty',
'amount',
'now',
'rather',
'might',
'very',
'mine',
'by',
'forty',
'whose',
'less',
'noone',
'own',
'before',
'being',
'detail',
'de',
'between',
'me',
'are',
'onto',
'something',
'among',
'namely',
'though',
'thereby',
'in',
'latterly',
'there',
'seeming',
'hereupon',
'why',
'until',
'above',
'next',
'had',
'then',
'whoever',
'nine',
'six',
'well',
'both',
'during',
'ours',
'ie',
'several',
'find',
'amongst',
'become',
'a',
'eg',
'at',
'otherwise',
'each',
'whereas',
'inc',
'five',
'anyhow',
'how',
'anything',
'herself',
'its',
'our',
'therefore',
'cannot',
'when',
'cant',
'last',
'with',
'system',
'everywhere',
'another',
'will',
'more',
'for',
'besides']