From: Val Krem
Newsgroups: comp.lang.python
Subject: Re: Read and count
Date: Thu, 10 Mar 2016 11:09:23 -0600

Thank you very much for the help.

First I want to count by city and year:

City    year    count
Xc1     2001    1
Xc1     2002    3
Yv1     2001    1
Yv2     2002    4

This worked fine!

Now I want to count by city only:

City    Count
Xc1     4
Yv2     5

Then combine these two objects with the original data and send it to a
file called "detout" with these columns:

"City", "year", "x", "cycount", "citycount"

Many thanks again.

This worked fine. I tried to count only by city and combine the three
objects together:

City
Xc1     4
Yv2     5

Sent from my iPad

> On Mar 10, 2016, at 3:11 AM, Jussi Piitulainen wrote:
>
> Val Krem writes:
>
>> Hi all,
>>
>> I am a new learner about python (moving from R to python) and trying
>> read and count the number of observation by year for each city.
>>
>>
>> The data set look like
>> city year x
>>
>> XC1 2001 10
>> XC1 2001 20
>> XC1 2002 20
>> XC1 2002 10
>> XC1 2002 10
>>
>> Yv2 2001 10
>> Yv2 2002 20
>> Yv2 2002 20
>> Yv2 2002 10
>> Yv2 2002 10
>>
>> out put will be
>>
>> city
>> xc1 2001 2
>> xc1 2002 3
>> yv1 2001 1
>> yv2 2002 3
>>
>>
>> Below is my starting code
>> count=0
>> fo=open("dat", "r+")
>> str = fo.read();
>> print "Read String is : ", str
>>
>> fo.close()
>
> Below's some of the basics that you want to study. Also look up the csv
> module in Python's standard library. You will want to learn these things
> even if you end up using some sort of third-party data-frame library (I
> don't know those but they exist).
>
> from collections import Counter
>
> # collections.Counter is a special dictionary type for just this
> counts = Counter()
>
> # with statement ensures closing the file
> with open("dat") as fo:
>     # file object provides lines
>     next(fo) # skip header line
>     for line in fo:
>         # test requires non-empty string, but lines
>         # contain at least newline character so ok
>         if line.isspace(): continue
>         # .split() at whitespace, omits empty fields
>         city, year, x = line.split()
>         # collections.Counter has default 0,
>         # key is a tuple (city, year), parentheses omitted here
>         counts[city, year] += 1
>
> print("city")
> for city, year in sorted(counts): # iterate over keys
>     print(city.lower(), year, counts[city, year], sep = "\t")
>
> # Alternatively:
> # for cy, n in sorted(counts.items()):
> #     city, year = cy
> #     print(city.lower(), year, n, sep = "\t")
> --
> https://mail.python.org/mailman/listinfo/python-list
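
The follow-up request at the top of this message (count by city only, then
write every original row together with both counts to "detout") could be
approached roughly as follows. This is only a sketch under assumptions, not
a definitive answer: the helper names read_rows and write_detail are made
up for illustration, the output is assumed to be tab-separated, and city
names are counted exactly as spelled in the file (apply .lower() first if
"XC1" and "xc1" should be treated as the same city). It reuses the Counter
approach from the quoted reply and the csv module that the reply recommends.

```python
import csv
from collections import Counter


def read_rows(lines):
    """Collect (city, year, x) tuples from whitespace-separated lines.

    The first line is assumed to be the header and is skipped.
    """
    lines = iter(lines)
    next(lines)  # skip header line
    rows = []
    for line in lines:
        if line.isspace():
            continue  # blank separator line between cities
        city, year, x = line.split()
        rows.append((city, year, x))
    return rows


def write_detail(rows, out):
    """Write every row plus its two counts as tab-separated text."""
    # one counter keyed by (city, year), one keyed by city alone
    cycount = Counter((city, year) for city, year, _ in rows)
    citycount = Counter(city for city, _, _ in rows)
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerow(["city", "year", "x", "cycount", "citycount"])
    for city, year, x in rows:
        writer.writerow([city, year, x, cycount[city, year], citycount[city]])
    return cycount, citycount
```

With the file names used in this thread, the two helpers would be called as

    with open("dat") as fo, open("detout", "w", newline="") as out:
        write_detail(read_rows(fo), out)

The rows are kept in a list because the counts are only complete after the
whole file has been read, so the output has to be written in a second pass.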