Monday, 29 June 2015

SciPy package on Fedora 20: ImportError: cannot import name anderson_ksamp


I'm trying to run a Python package called D3E for single-cell differential gene expression. I have Python 2.7.5 on Fedora 20. I just installed the SciPy package using the instructions here:

sudo yum install numpy scipy python-matplotlib ipython python-pandas sympy python-nose

However, when I try to run the script, I keep getting a SciPy error:

bash-4.2$ python D3ECmd.py ~/Documents/geneExpressionTable.txt ~/outputFile.txt cellType1 cellTYpe2 -n=0, -z=1

    Traceback (most recent call last):
      File "D3ECmd.py", line 34, in <module>
        from D3EUtil import readData, getParamsBayesian, getParamsMoments, cramerVonMises, logStatus, goodnessOfFit, distributionTest
      File "/home/user/Software/D3E/D3EUtil.py", line 36, in <module>
        from scipy.stats import gmean, ks_2samp, anderson_ksamp
    ImportError: cannot import name anderson_ksamp

What would you recommend I try to fix this error?

Thanks.
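One likely cause, offered as a guess rather than a diagnosis: anderson_ksamp is documented as new in SciPy 0.14.0, and the SciPy that yum installs on Fedora 20 may be older than that. A minimal sketch for checking the installed version follows; the helper names (version_tuple, supports_anderson_ksamp, check_installed_scipy) are hypothetical and not part of SciPy or D3E.

```python
import re

# scipy.stats.anderson_ksamp is documented as "new in version 0.14.0".
MIN_VERSION = (0, 14)


def version_tuple(version):
    """Turn a string such as '0.12.1' or '0.14.0rc1' into (major, minor)."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError("unrecognised version string: %r" % version)
    return int(match.group(1)), int(match.group(2))


def supports_anderson_ksamp(version):
    """True if a SciPy with this version string should ship anderson_ksamp."""
    return version_tuple(version) >= MIN_VERSION


def check_installed_scipy():
    """Return (version, supported) for the local SciPy, or None if it is absent."""
    try:
        import scipy
    except ImportError:
        return None
    return scipy.__version__, supports_anderson_ksamp(scipy.__version__)
```

If check_installed_scipy() reports an unsupported version, upgrading SciPy (for example with pip install --user --upgrade scipy, assuming pip is available) would be the next thing to try.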


How to place text above hatched zone using matplotlib?


How can I place text on top of a hatched zone using matplotlib? Below is my sample code, for easy understanding:

from pylab import *
fig=figure()
x=array([0,1])
yc=array([0.55,0.48])


yhc=array([0.55,0.68])

yagg=array([0.45,0.48])

plot(x,yc,'k-',linewidth=1.5)
plot(x,yhc,'k-',linewidth=1.5)
plot(x,yagg,'k-',linewidth=1.5)

xticks(fontsize = 22)
yticks(fontsize = 22)
ylim(0,1)
ax=axes()


p=fill_between(x, yc, yhc,color="none")

from matplotlib.patches import PathPatch 
for path in p.get_paths(): 
    p1 = PathPatch(path, fc="none", hatch="/") 
    ax.add_patch(p1) 
    p1.set_zorder(p.get_zorder()-0.1) 

props = dict(boxstyle='round',facecolor='white', alpha=1,frameon='false')
text(0.6, 0.55, 'hi',fontsize=22)
fig.savefig('vp.png', transparent=True,bbox_inches='tight')


The hatched zone really makes it hard to see the text.
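Note that the props dictionary in the code above is defined but never passed to text(). One possible approach, sketched here under assumptions, is to pass it as the bbox argument so the label gets an opaque white background, and to raise the text's zorder so it is drawn above the hatching. The function name draw_label_over_hatch is hypothetical, the frameon key is dropped because it is not a text-box property, and the Agg backend is chosen only so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: render off-screen, no GUI needed
import matplotlib.pyplot as plt


def draw_label_over_hatch(filename=None):
    """Draw a hatched band and put a label with a white box on top of it."""
    fig, ax = plt.subplots()
    x = [0, 1]
    yc = [0.55, 0.48]
    yhc = [0.55, 0.68]

    ax.plot(x, yc, "k-", linewidth=1.5)
    ax.plot(x, yhc, "k-", linewidth=1.5)
    # Hatched region between the two curves, with no fill colour.
    ax.fill_between(x, yc, yhc, facecolor="none", edgecolor="k", hatch="/")
    ax.set_ylim(0, 1)

    # An opaque white box behind the text hides the hatch lines under it.
    props = dict(boxstyle="round", facecolor="white", edgecolor="none", alpha=1)
    label = ax.text(0.6, 0.55, "hi", fontsize=22, bbox=props, zorder=10)

    if filename is not None:
        fig.savefig(filename, bbox_inches="tight")
    return fig, label
```

Calling draw_label_over_hatch('vp.png') would write the figure to disk. Saving with transparent=True, as in the original code, only affects the figure and axes backgrounds, so the white text box should remain opaque.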


Adding labels iteratively to Tkinter form


I am trying to iteratively add fields to a Tkinter form by looping through a list. The form generates with no errors, but the labels are not present. What is going on here?

from Tkinter import *
class User_Input:

    def __init__(self, parent):
        fields = ['Text Box 1', 'Text Box 2']
        GUIFrame =Frame(parent, width=300, height=200)
        GUIFrame.pack(expand=False, anchor=CENTER)
        field_index = 10
        for field in fields:
            self.field = Entry(text=field)
            self.field.place(x=65, y = field_index)
            field_index += 25

        self.Button2 = Button(parent, text='exit', command= parent.quit)
        self.Button2.place(x=160, y=60)

root = Tk()
MainFrame =User_Input(root)

root.mainloop()
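A possible explanation, hedged: the loop creates only Entry widgets, and an Entry does not display a caption, so no label text ever appears. One way to get captions is to create a Label next to each Entry. The sketch below assumes that layout; the names row_positions and build_form are hypothetical, and the Tkinter import is done inside build_form so the position helper can be used without a display.

```python
def row_positions(fields, start_y=10, step=25):
    """Pair each field name with the y coordinate of its row."""
    return [(name, start_y + i * step) for i, name in enumerate(fields)]


def build_form(parent, fields):
    """Add one Label and one Entry per field; return the entries by name."""
    try:
        import tkinter as tk   # Python 3
    except ImportError:
        import Tkinter as tk   # Python 2, as in the question

    frame = tk.Frame(parent, width=300, height=200)
    frame.pack(expand=False, anchor=tk.CENTER)

    entries = {}
    for name, y in row_positions(fields):
        # The Label carries the visible caption; the Entry is the input box.
        tk.Label(frame, text=name).place(x=5, y=y)
        entry = tk.Entry(frame)
        entry.place(x=85, y=y)
        entries[name] = entry  # keep a reference to every entry, not just the last
    return entries
```

Usage would be along the lines of root = Tk(), then build_form(root, ['Text Box 1', 'Text Box 2']), then root.mainloop(). Storing the entries in a dictionary also avoids overwriting self.field on every pass through the loop.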


Viewing the content of a Spark Dataframe Column


I'm using Spark 1.3.1.

I am trying to view the values of a Spark dataframe column in Python. With a Spark dataframe, I can call df.collect() to view the contents of the dataframe, but as far as I can see there is no such method for a Spark dataframe column.

For example, the dataframe df contains a column named 'zip_code'. I can do df['zip_code'], which returns a pyspark.sql.dataframe.Column object, but I can't find a way to view the values in df['zip_code'].


Python suds AttributeError: Fault instance has no attribute 'detail'


Howdie do,

I've created a package class that sets some default values such as shipper, consignee, packages and commodities.

The issue is that when I call the shippackage method, which uses a suds client, I receive the error:

AttributeError: Fault instance has no attribute 'detail'

The first file listed below is my test file which sets the necessary dictionaries:

import ship

# Create a new package object
package1 = ship.Package('TX')

# Set consignee
package1.setconsignee('Dubs Enterprise', 'JW', '2175 14th Street', 'Troy', 'NY', '12180', 'USA')

# Set default packaging
package1.setdefaults('CUSTOM', '12/12/15', 'oktoleave')

# Add commodity description with each package
package1.addpackage('12.3 lbs', '124.00')
package1.addcommodity('This is package number 1', '23.5 lbs')

# Add commodity list to defaults dictionary
package1.setcommoditylist()

# Add package list to packages dictionary
package1.setpackagelist()
package1.shippackage()

The module that is being imported is the following:

from suds.client import Client
from suds.bindings import binding

binding.envns = ('SOAP-ENV', 'http://ift.tt/18hkEkn')
client = Client('http://localhost/ProgisticsAMP/amp.svc/wsdl', headers={'Content-Type': 'application/soap+xml'},
                faults=False)


class Package(object):

    def __init__(self, shipper):
        self.commoditylist = []
        self.commodityContents = {}
        self.consignee = {}
        self.defaults = {}
        self.packagelist = []
        self.packages = {}

        self.defaults['shipper'] = shipper

    def setconsignee(self, company, contact, address, city, state, zip, country):
        self.consignee['company'] = company
        self.consignee['contact'] = contact
        self.consignee['address1'] = address
        self.consignee['city'] = city
        self.consignee['stateProvince'] = state
        self.consignee['postalCode'] = zip
        self.consignee['countrySymbol'] = country

        self.defaults['consignee'] = self.consignee

    def setdefaults(self, packaging, shipdate, deliverymethod):
        self.defaults['packaging'] = packaging
        self.defaults['shipdate'] = shipdate
        self.defaults['deliveryMethod'] = deliverymethod

    def addcommodity(self, description, unitweight):
        commodity = {}
        commodity['description'] = description
        commodity['unitWeight'] = {'value': unitweight}

        self.commoditylist.append(commodity)

    def addpackage(self, weight, declarevalue):
        package = {}
        package['weight'] = {'value': weight}
        package['declaredValueAmount'] = {'amount': declarevalue, 'currency': 'USD'}
        self.packagelist.append(package)

    def setcommoditylist(self):
        self.commodityContents = {'item': self.commoditylist}

        self.defaults['commodityContents'] = self.commodityContents

    def setpackagelist(self):
        self.packages = {'item': self.packagelist}

    def shippackage(self):
        print self.defaults
        print self.packages

        # print client
        response = client.service.Ship('CONNECTSHIP_UPS.UPS.GND', self.defaults, self.packages, True, 'release')
        result = response.result
        print result

Within the shippackage method, I print self.packages and self.defaults and confirm that they have data within them before I call

client.service.Ship()

So I know it does have values within it before I call the client.service.Ship()

Am I missing something here? Why won't it take the defaults and packages dictionary that I've set?


How can I extract all satellite adjectives from WordNet NLTK and save them to a text file? (Python)


I am trying to extract all satellite adjective synsets from WordNet and save them to a text file. Note that satellite adjectives are denoted as 's' in the synset name, e.g., "(fantastic.s.02)". The following is my code:

    def extract_sat_adjectives():
        sat_adj_counter = 0
        sat_adjectives = []
        for i in wn.all_synsets():
            if i.pos() in ['s']:
                sat_adj_counter += 1
                sat_adjectives = sat_adjectives + [i.name()]
        fo = open("C:\\Users\\Nora\\Desktop\\satellite_adjectives.txt", "wb")
        for x in sat_adjectives:
            fo.write("%s\n" % x)
        fo.close()


    extract_sat_adjectives()

The error I get is:

TypeError: 'str' does not support the buffer interface  

How can I save the adjectives to the text file? Thanks in advance.
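This error message is typical of Python 3 when a str is written to a file opened in binary mode ("wb"). A minimal sketch of one fix, assuming Python 3, is to open the file in text mode instead. The function names save_names and satellite_adjective_names are hypothetical, and the NLTK import is kept inside its function so the file-writing part can be tried on its own.

```python
def satellite_adjective_names():
    """Collect the names of all satellite adjective synsets ('s')."""
    # Assumes NLTK and its WordNet corpus are installed.
    from nltk.corpus import wordnet as wn
    return [synset.name() for synset in wn.all_synsets() if synset.pos() == 's']


def save_names(names, path):
    """Write one name per line to a text file."""
    # Text mode ("w") accepts str; binary mode ("wb") expects bytes,
    # which is what triggers the TypeError in the question.
    with open(path, "w", encoding="utf-8") as fo:
        for name in names:
            fo.write("%s\n" % name)
```

With these, save_names(satellite_adjective_names(), path) would replace the body of the original function. The with block closes the file even if a write fails.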


Python goto text file line without reading previous lines


I am working with a very large text file (TSV) of around 200 million entries. One of the columns is a date, and the records are sorted by date. I want to start reading records from a given date. Currently I read from the start, which is very slow, since I need to read 100-150 million records just to reach that record. I was thinking that with binary search I could get there in at most about 28 extra record reads (log2 of 200 million). Does Python allow reading the nth line without caching or reading the lines before it?
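Lines have variable length, so a file object cannot jump to the nth line directly, but it can seek to a byte offset. A common workaround is to binary-search on byte offsets and resynchronise to the next line start after each seek. The sketch below assumes the file is opened in binary mode, that the date column sorts correctly as plain bytes (for example ISO dates such as 2015-06-29), and that the date is in the first column by default. The function name find_first_line is hypothetical.

```python
import os


def find_first_line(f, target, column=0, sep=b"\t"):
    """Return the byte offset of the first line whose key is >= target.

    f must be a seekable binary file sorted in ascending order on `column`.
    """
    def line_start_at_or_after(offset):
        # Move to the first line that starts at or after `offset`.
        if offset == 0:
            f.seek(0)
        else:
            f.seek(offset - 1)
            f.readline()  # skip the (possibly partial) line we landed in
        return f.tell()

    f.seek(0, os.SEEK_END)
    lo, hi = 0, f.tell()
    while lo < hi:
        mid = (lo + hi) // 2
        line_start_at_or_after(mid)
        line = f.readline()
        # End of file counts as "past the target".
        if not line or line.rstrip(b"\r\n").split(sep)[column] >= target:
            hi = mid
        else:
            lo = mid + 1
    return line_start_at_or_after(lo)
```

After f.seek(find_first_line(f, b"2015-06-29")), ordinary iteration over f starts at the first matching record. Because the search bisects byte offsets rather than record numbers, the number of steps is about log2 of the file size in bytes, which is somewhat more than 28 but still only a few dozen short reads.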