


Re: Help on error " ValueError: For numerical factors, num_columns must be an int "

Newsgroups comp.lang.python
Date Wed, 16 Dec 2015 18:37:34 -0800 (PST)
Message-ID <dec2bb9d-dfd1-48ea-bd59-15eab60b2594@googlegroups.com>
From Robert <rxjwg98@gmail.com>



On Wednesday, December 16, 2015 at 8:57:30 PM UTC-5, Josef Pktd wrote:
> On Wednesday, December 16, 2015 at 9:50:35 AM UTC-5, Robert wrote:
> > On Wednesday, December 16, 2015 at 6:34:21 AM UTC-5, Mark Lawrence wrote:
> > > On 16/12/2015 10:44, Robert wrote:
> > > > Hi,
> > > >
> > > > When I run the following code, there is an error:
> > > >
> > > > ValueError: For numerical factors, num_columns must be an int
> > > >
> > > >
> > > > ================
> > > > import numpy as np
> > > > import pandas as pd
> > > > from patsy import dmatrices
> > > > from sklearn.linear_model import LogisticRegression
> > > >
> > > > X = [0.5,0.75,1.0,1.25,1.5,1.75,1.75,2.0,2.25,2.5,2.75,3.0,3.25,
> > > > 3.5,4.0,4.25,4.5,4.75,5.0,5.5]
> > > > y = [0,0,0,0,0,0,1,0,1,0,1,0,1,0,1,1,1,1,1,1]
> > > >
> > > > zipped = list(zip(X,y))
> > > > df = pd.DataFrame(zipped,columns = ['study_hrs','p_or_f'])
> > > >
> > > > y, X = dmatrices('p_or_f ~ study_hrs', df, return_type="dataframe")
> > > > =======================
> > > >
> > > > I have checked that 'df' is of this type:
> > > > =============
> > > > type(df)
> > > > Out[25]: pandas.core.frame.DataFrame
> > > > =============
> > > >
> > > > I cannot figure out where the problem is. Can you help me?
> > > > Thanks.
> > > >
> > > > Error message:
> > > > ..........
> > > >
> > > >
> > > > ---------------------------------------------------------------------------
> > > > ValueError                                Traceback (most recent call last)
> > > > C:\Users\rj\pyprj\stackoverflow_logisticregression0.py in <module>()
> > > >       17 df = pd.DataFrame(zipped,columns = ['study_hrs','p_or_f'])
> > > >       18
> > > > ---> 19 y, X = dmatrices('p_or_f ~ study_hrs', df, return_type="dataframe")
> > > >       20
> > > >       21 y = np.ravel(y)
> > > >
> > > > C:\Users\rj\AppData\Local\Enthought\Canopy\User\lib\site-packages\patsy\highlevel.pyc in dmatrices(formula_like, data, eval_env, NA_action, return_type)
> > > >      295     eval_env = EvalEnvironment.capture(eval_env, reference=1)
> > > >      296     (lhs, rhs) = _do_highlevel_design(formula_like, data, eval_env,
> > > > --> 297                                       NA_action, return_type)
> > > >      298     if lhs.shape[1] == 0:
> > > >      299         raise PatsyError("model is missing required outcome variables")
> > > >
> > > > C:\Users\rj\AppData\Local\Enthought\Canopy\User\lib\site-packages\patsy\highlevel.pyc in _do_highlevel_design(formula_like, data, eval_env, NA_action, return_type)
> > > >      150         return iter([data])
> > > >      151     design_infos = _try_incr_builders(formula_like, data_iter_maker, eval_env,
> > > > --> 152                                       NA_action)
> > > >      153     if design_infos is not None:
> > > >      154         return build_design_matrices(design_infos, data,
> > > >
> > > > C:\Users\rj\AppData\Local\Enthought\Canopy\User\lib\site-packages\patsy\highlevel.pyc in _try_incr_builders(formula_like, data_iter_maker, eval_env, NA_action)
> > > >       55                                       data_iter_maker,
> > > >       56                                       eval_env,
> > > > ---> 57                                       NA_action)
> > > >       58     else:
> > > >       59         return None
> > > >
> > > > C:\Users\rj\AppData\Local\Enthought\Canopy\User\lib\site-packages\patsy\build.pyc in design_matrix_builders(termlists, data_iter_maker, eval_env, NA_action)
> > > >      704                             factor_states[factor],
> > > >      705                             num_columns=num_column_counts[factor],
> > > > --> 706                             categories=None)
> > > >      707         else:
> > > >      708             assert factor in cat_levels_contrasts
> > > >
> > > > C:\Users\rj\AppData\Local\Enthought\Canopy\User\lib\site-packages\patsy\design_info.pyc in __init__(self, factor, type, state, num_columns, categories)
> > > >       86         if self.type == "numerical":
> > > >       87             if not isinstance(num_columns, int):
> > > > ---> 88                 raise ValueError("For numerical factors, num_columns "
> > > >       89                                  "must be an int")
> > > >       90             if categories is not None:
> > > >
> > > > ValueError: For numerical factors, num_columns must be an int
> > > >
> > > 
> > > Slap the ValueError into a search engine and the first hit is 
> > > https://groups.google.com/forum/#!topic/pystatsmodels/KcSzNqDxv-Q
> 
> This was fixed in patsy 0.4.1 as discussed in this statsmodels thread.
> You need to upgrade patsy from 0.4.0.
> 
> AFAIR, the type checking was too strict and broke with recent numpy versions.
> 
> Josef
> 
> 
> > > 
> > > -- 
> > > My fellow Pythonistas, ask not what our language can do for you, ask
> > > what you can do for our language.
> > > 
> > > Mark Lawrence
> > 
> > Hi,
> > I don't see a solution to my problem there. I found the following demo code at
> > 
> > https://patsy.readthedocs.org/en/v0.1.0/API-reference.html#patsy.dmatrix
> > 
> > It doesn't work on Canopy either. Does it work on your computer?
> > Thanks,
> > 
> > /////////////
> > demo_data("a", "x", nlevels=3)
> > Out[134]: 
> > {'a': ['a1', 'a2', 'a3', 'a1', 'a2', 'a3'],
> >  'x': array([ 1.76405235,  0.40015721,  0.97873798,  2.2408932 ,  1.86755799,
> >         -0.97727788])}
> > 
> > mat = dmatrix("a + x", demo_data("a", "x", nlevels=3))

Thanks, that is right.
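
For anyone who lands on this thread later, here is a rough sketch of the two points above: checking whether the installed patsy predates 0.4.1, and why a strict isinstance(..., int) test can reject a NumPy integer. The helper names (parse_version, needs_upgrade, is_integer_like) are made up for illustration and are not part of patsy. The lenient check shown is only one way to relax such a test, not necessarily what patsy 0.4.1 does. The sketch assumes Python 3.8+ (for importlib.metadata) and plain numeric version strings like "0.4.0".

```python
import numbers
from importlib import metadata


def parse_version(text):
    """Turn a dotted release string such as '0.4.1' into a tuple of ints."""
    return tuple(int(part) for part in text.split(".")[:3])


def needs_upgrade(installed, fixed="0.4.1"):
    """True when the installed release predates the release with the fix."""
    return parse_version(installed) < parse_version(fixed)


def is_integer_like(value):
    """Lenient check: accepts int and NumPy integer scalars, not bool or float."""
    return isinstance(value, numbers.Integral) and not isinstance(value, bool)


if __name__ == "__main__":
    # Report the installed patsy release, if there is one.
    try:
        installed = metadata.version("patsy")
        print("patsy", installed, "upgrade needed:", needs_upgrade(installed))
    except metadata.PackageNotFoundError:
        print("patsy is not installed in this environment")
    except ValueError:
        print("patsy version string is not purely numeric; check it by hand")

    # Show the strict versus lenient type check on a NumPy scalar.
    try:
        import numpy as np
    except ImportError:
        print("NumPy is not installed; skipping the scalar demonstration")
    else:
        n = np.int64(2)
        print(isinstance(n, int))   # strict check: False on Python 3
        print(is_integer_like(n))   # lenient check
```

If needs_upgrade reports True, upgrading patsy (for example with pip install --upgrade patsy) is the fix Josef describes.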



