I want to add the matching item from the list grundformen to each entry of list_of_occurences.
My for loop does not work as intended, though. The inner loop does not restart from the beginning of the file for each entry; it goes through the rows of the reader only once. As a result, the list is not filled completely.
This is what it prints. The entries at the end, from ['VAPP', 10] onward, are missing their item because the search does not start again from the beginning of the input:
# List_of_occurrences (1 line - wrapped for easier reading)
[['NN', 1328, ('Ziel',)], ['ART', 771, ('der',)],
['$.', 732, ('_',)], ['VVFIN', 682, ('schlagen',)],
['PPER', 592, ('sie',)], ['$,', 561, ('_',)],
['ADV', 525, ('So',)], ['APPR', 507, ('in',)],
['NE', 433, ('Johanna',)], ['$(', 363, ('_',)],
['VAFIN', 334, ('haben',)], ['ADJA', 307, ('tragisch',)],
['ADJD', 278, ('recht',)], ['KON', 228, ('Doch',)],
['VVPP', 194, ('reichen',)], ['VVINF', 161, ('stören',)],
['KOUS', 151, ('Während',)], ['PPOSAT', 120, ('ihr',)],
['PTKVZ', 104, ('weiter',)], ['PRF', 98, ('sich',)],
['APPRART', 90, ('zu',)], ['PTKNEG', 87, ('nicht',)],
['VMFIN', 76, ('sollen',)], ['PIAT', 66, ('kein',)],
['PIS', 65, ('etwas',)], ['PTKZU', 52, ('zu',)],
['PRELS', 51, ('wer',)], ['PROAV', 42, ('dabei',)],
['PDS', 38, ('jener',)], ['PDAT', 37, ('dieser',)],
['PWAV', 30, ('wie',)], ['PWS', 26, ('Was',)],
['CARD', 24, ('drei',)], ['KOKOM', 21, ('wie',)],
['VAINF', 18, ('werden',)], ['KOUI', 15, ('um',)],
['VMINF', 10, ('können',)], ['VVIZU', 10, ('aufklären',)],
['VAPP', 10], ['PTKA', 6], ['PTKANT', 6], ['PWAT', 4],
['VVIMP', 4], ['PRELAT', 4], ['APZR', 3], ['APPO', 2],
['FM', 1]]
# Grundformen (1 line, wrapped for reading)
['Ziel', 'der', '_', 'schlagen', 'sie', '_', 'So', 'in', 'Johanna',
'_', 'haben', 'tragisch', 'recht', 'Doch', 'reichen', 'stören',
'Während', 'ihr', 'weiter', 'sich', 'zu', 'nicht', 'sollen', 'kein',
'etwas', 'zu', 'wer', 'dabei', 'jener', 'dieser', 'wie', 'Was',
'drei', 'wie', 'werden', 'um', 'können', 'aufklären']
# My code
import collections
import csv

# First pass: count how often each tag (column 5) occurs.
occurences = collections.Counter()
with open("material-2.csv", mode='r', newline='', encoding="utf-8") as material:
    reader = csv.reader(material, delimiter='\t', quotechar="\t")
    for line in reader:
        if line:
            occurences[line[5]] += 1
list_of_occurences = [list(elem) for elem in occurences.most_common()]

# Second pass: look up a base form (column 2) for each tag.
grundformen = []
with open('material-2.csv', mode='r', newline='', encoding="utf-8") as material:
    reader = csv.reader(material, delimiter='\t', quotechar="\t")
    for elem in list_of_occurences:
        # Problem: the reader is not rewound here, so this inner loop
        # continues from the row where the previous search stopped.
        for row in reader:
            if row != [] and row[5] == elem[0]:
                grundformen.append(row[2])
                break

# Insert each base form found into the corresponding entry.
iterator = 0
for elem in grundformen:
    list_of_occurences[iterator].insert(2, elem)
    iterator = iterator + 1
print(list_of_occurences)
print(grundformen)
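
The behaviour can be reproduced without the real file. The following is only a minimal sketch: the three sample rows are made up for illustration, `io.StringIO` stands in for the opened file, and `quoting=csv.QUOTE_NONE` is an assumption (the original code passes `quotechar="\t"` instead). It shows that a `csv.reader` can be consumed only once, and that reading the rows into a list first lets every search start from the beginning.

```python
import csv
import io

# Hypothetical sample in the same tab-separated layout (not the real file).
sample = (
    "1\tAls\tAls\t_\t_\tKOUS\n"
    "2\tes\tes\t_\t_\tPPER\n"
    "3\tzu\tzu\t_\t_\tPTKA\n"
)
tags = ["PPER", "KOUS"]

# Searching directly on the reader: the second search resumes after the
# row where the first one stopped, so "KOUS" in row 1 is never seen again.
reader = csv.reader(io.StringIO(sample), delimiter="\t", quoting=csv.QUOTE_NONE)
found_with_reader = []
for tag in tags:
    for row in reader:
        if row and row[5] == tag:
            found_with_reader.append(row[2])
            break

# Reading all rows into a list first: a list can be iterated repeatedly,
# so each search starts again at the first row.
rows = list(csv.reader(io.StringIO(sample), delimiter="\t", quoting=csv.QUOTE_NONE))
found_with_list = []
for tag in tags:
    for row in rows:
        if row and row[5] == tag:
            found_with_list.append(row[2])
            break

print(found_with_reader)
print(found_with_list)
```

Keeping all rows in a list costs memory proportional to the file size, which should be acceptable for a file of this kind but is worth keeping in mind for very large inputs.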
Whole input file: http://ift.tt/1CF67ed
Part of my input file:
1 Als Als _ _ KOUS _ _ 6 6 CP CP _ _
2 es es _ _ PPER _ 3|Nom|Sg|Neut 6 6 SB SB _ _
3 zu zu _ _ PTKA _ _ 4 4 MO MO _ _
4 schneien schneien _ _ ADJD _ Comp|Dat|Sg|Fem 5 5 MO MO _ _
5 aufgehört aufhören _ _ VVPP _ Psp 6 6 OC OC _ _
6 hatte haben _ _ VAFIN _ 3|Sg|Past|Ind 8 8 MO MO _ _
7 , _ _ _ $, _ _ 8 8 PUNC PUNC _ _
8 verließ verlassen _ _ VVFIN _ 3|Sg|Past|Ind 0 0 ROOT ROOT _ _
9 Johanna Johanna _ _ NE _ Nom|Sg|Masc 8 8 SB SB _ _
10 von von _ _ APPR _ _ 5 5 SBP SBP _ _
11 Rotenhoff Rotenhoff _ _ NE _ Dat|Sg|Neut 10 10 NK NK _ _
12 , _ _ _ $, _ _ 8 8 PUNC PUNC _ _
13 ohne ohne _ _ KOUI _ _ 18 18 CP CP _ _
14 ein ein _ _ ART _ Nom|Sg|Neut 16 16 NK NK _ _
15 rechtes recht _ _ ADJA _ Pos|Nom|Sg|Neut 16 16 NK NK _ _
16 Ziel Ziel _ _ NN _ Nom|Sg|Neut 18 18 OA OA _ _
17 zu zu _ _ PTKZU _ _ 18 18 PM PM _ _
18 haben haben _ _ VAINF _ Inf 8 8 MO MO _ _
19 , _ _ _ $, _ _ 18 18 PUNC PUNC _ _
20 das der _ _ ART _ Nom|Sg|Neut 21 21 NK NK _ _
21 Gutshaus Gutshaus _ _ NN _ Nom|Sg|Neut 16 16 APP APP _ _
22 . _ _ _ $. _ _ 8 8 PUNC PUNC _ _
How can I change my loop so that it fills in every entry?
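
One possible direction, offered only as a sketch and not as a definitive solution: the counting and the lookup could be done in a single pass over the file, remembering the first base form seen for each tag. The function name `count_tags_with_first_lemma` and the sample rows are hypothetical, and `quoting=csv.QUOTE_NONE` is an assumption about how the file should be parsed.

```python
import collections
import csv
import io


def count_tags_with_first_lemma(lines):
    """Return [[tag, count, lemma], ...] sorted by descending count.

    `lines` is any iterable of tab-separated lines, e.g. an open file.
    The lemma is the base form (column 2) of the first row with that tag.
    """
    counts = collections.Counter()
    first_lemma = {}
    reader = csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if not row:  # skip empty lines
            continue
        tag, lemma = row[5], row[2]
        counts[tag] += 1
        # setdefault stores the lemma only the first time a tag is seen.
        first_lemma.setdefault(tag, lemma)
    return [[tag, n, first_lemma[tag]] for tag, n in counts.most_common()]


# Made-up sample rows in the same layout as the input file.
sample = io.StringIO(
    "1\tAls\tAls\t_\t_\tKOUS\t_\t_\n"
    "2\tes\tes\t_\t_\tPPER\t_\t_\n"
    "\n"
    "1\tzu\tzu\t_\t_\tPTKA\t_\t_\n"
    "2\tsie\tsie\t_\t_\tPPER\t_\t_\n"
)
result = count_tags_with_first_lemma(sample)
print(result)
```

For the real data, the open file object from `open("material-2.csv", newline="", encoding="utf-8")` would be passed in place of `sample`. Because every tag that is counted also gets a base form in the same pass, no entry can end up without one.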