Python/IPython strange non-reproducible list index out of range error
I have recently been learning some Python and how to apply it to my work. I have written a couple of scripts successfully, but I am having an issue I just cannot figure out.
I am opening a file with ~4000 lines, two tab-separated columns per line. When reading the input file, I get an index error saying that the list index is out of range. However, while I get the error every time, it doesn't happen on the same line every time (as in, it throws the error on a different line each run!). So, for some reason, it generally works but then (seemingly) randomly fails.
As I literally only started learning Python last week, I am stumped. I have looked around for the same problem but have not found anything similar. Furthermore, I don't know whether this problem is specific to the language or to IPython. Any help would be greatly appreciated!
    input = open("count.txt", "r")
    changelist = []
    listtosort = []
    second = str()
    output = open("output.txt", "w")
    for each in input:
        splits = each.split("\t")
        changelist = list(splits[0])
        second = int(splits[1])
        print second
        if changelist[7] == ";":
            changelist.insert(6, "000")
            va = "".join(changelist)
            var = va + ("\t") + str(second)
            listtosort.append(var)
            output.write(var)
        elif changelist[8] == ";":
            changelist.insert(6, "00")
            va = "".join(changelist)
            var = va + ("\t") + str(second)
            listtosort.append(var)
            output.write(var)
        elif changelist[9] == ";":
            changelist.insert(6, "0")
            va = "".join(changelist)
            var = va + ("\t") + str(second)
            listtosort.append(var)
            output.write(var)
        else:
            #output.write(str("".join(changelist)))
            va = "".join(changelist)
            var = va + ("\t") + str(second)
            listtosort.append(var)
            output.write(var)
    output.close()
The error
    ---------------------------------------------------------------------------
    IndexError                                Traceback (most recent call last)
    /home/a/Desktop/sharedfolder/ipytest/individ.ins.count.test/<ipython-input-87-32f9b0a1951b> in <module>()
         57     splits = each.split("\t")
         58     changelist = list(splits[0])
    ---> 59     second = int(splits[1])
         60
         61     print second

    IndexError: list index out of range
Input:
    ID=cds0;Name=NP_414542.1;Parent=gene0;Dbxref=ASAP:ABE-0000006,UniProtKB%2FSwiss-Prot:P0AD86,Genbank:NP_414542.1,EcoGene:EG11277,GeneID:944742;gbkey=CDS;product=thr 12
    ID=cds1000;Name=NP_415538.1;Parent=gene1035;Dbxref=ASAP:ABE-0003451,UniProtKB%2FSwiss-Prot:P31545,Genbank:NP_415538.1,EcoGene:EG11735,GeneID:946500;gbkey=CDS;product=deferrrochelatase%2C 50
    ID=cds1001;Name=NP_415539.1;Parent=gene1036;Note=PhoB-dependent%2C 36
Desired output:
    ID=cds0000;Name=NP_414542.1;Parent=gene0;Dbxref=ASAP:ABE-0000006,UniProtKB%2FSwiss-Prot:P0AD86,Genbank:NP_414542.1,EcoGene:EG11277,GeneID:944742;gbkey=CDS;product=thr 12
    ID=cds1000;Name=NP_415538.1;Parent=gene1035;Dbxref=ASAP:ABE-0003451,UniProtKB%2FSwiss-Prot:P31545,Genbank:NP_415538.1,EcoGene:EG11735,GeneID:946500;gbkey=CDS;product=deferrrochelatase%2C 50
    ID=cds1001;Name=NP_415539.1;Parent=gene1036;Note=PhoB-dependent%2C 36
Accepted answer
The reason you're getting the IndexError is that your input file is apparently not entirely tab-delimited. That's why there is nothing at splits[1] when you attempt to access it.
Your code could use some refactoring. First of all, you're repeating yourself with the if-checks, which is unnecessary. This just pads the cds0 to 7 characters, which is probably not what you want. I threw the following together to demonstrate how you could refactor your code to be a little more Pythonic and DRY. I can't guarantee it'll work with your dataset, but I'm hoping it might help you understand how to do things differently.
    to_sort = []
    # We can open two files using the with statement. This will also handle
    # closing the files for us when we exit the block.
    with open("count.txt", "r") as inp, open("output.txt", "w") as out:
        for each in inp:
            # Split at ';'... So you won't have to worry about whether or not
            # the file is tab delimited.
            changed = each.split(";")
            # Get the value you want. This is called unpacking.
            # The value before '=' will always be 'ID', so we don't really care about it.
            # _ is generally used as a variable name when the value is discarded.
            _, value = changed[0].split("=")
            # Pad the value on the right with zeros to 7 characters. Python string
            # formatting makes this very easy. This replaces the current value in the list.
            changed[0] = "ID={:0<7}".format(value)
            # Join the changed list with the original separator and
            # append it to the sort list.
            to_sort.append(";".join(changed))
        # Write the results to the file all at once. Your test data already
        # provided the newlines, so you can just write it out as it is.
        out.writelines(to_sort)
    # Do what else you need to do. Maybe to_sort.sort()?
You'll notice that this code reduces your code down to 8 lines but achieves the exact same thing, does not repeat itself, and is pretty easy to understand.
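If the goal is what the if/elif chain in the question appears to do, namely left-padding the number after "cds" with zeros to four digits (cds0 becomes cds0000, cds12 would become cds0012), a regular expression can express that in one step. This is only a sketch in Python 3: the name pad_cds_id, the ID_PATTERN regex, and the default width of 4 are assumptions inferred from the three sample lines, not something confirmed by the asker.

```python
import re

# Assumption: every record starts with "ID=cds" followed by digits,
# as in the sample input from the question.
ID_PATTERN = re.compile(r"^(ID=cds)(\d+)")


def pad_cds_id(line, width=4):
    """Left-pad the digits of a leading 'ID=cds<N>' with zeros (hypothetical helper)."""
    return ID_PATTERN.sub(
        lambda m: m.group(1) + m.group(2).zfill(width),
        line,
        count=1,
    )


print(pad_cds_id("ID=cds0;Name=NP_414542.1\t12"))
print(pad_cds_id("ID=cds1000;Name=NP_415538.1\t50"))
```

Lines that do not start with the assumed prefix are returned unchanged, so blank or malformed lines do not raise an exception here.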
Please read PEP 8 and the Zen of Python, and go through the official tutorial.
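To check the diagnosis given in this answer (that some lines are not tab-delimited), it may help to list the offending lines before processing anything. The following Python 3 sketch does that; find_malformed_lines is a hypothetical helper, and it assumes the second column is always a non-negative integer, as in the sample input.

```python
def find_malformed_lines(lines):
    """Return (line_number, line) pairs for lines that are not '<text>TAB<integer>'."""
    problems = []
    for number, line in enumerate(lines, start=1):
        fields = line.rstrip("\n").split("\t")
        # A good line has exactly two fields and a numeric second field.
        if len(fields) != 2 or not fields[1].strip().isdigit():
            problems.append((number, line))
    return problems


# Small made-up sample: one good line, one blank line, one space-separated line.
sample = ["ID=cds0;gbkey=CDS\t12\n", "\n", "ID=cds1;gbkey=CDS 50\n"]
for number, line in find_malformed_lines(sample):
    # repr() makes invisible characters such as tabs and newlines visible.
    print(number, repr(line))
```

For the real data, passing the open file object works the same way, for example find_malformed_lines(open("count.txt")). Blank lines and lines where the tab was replaced by spaces both show up in the result.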