| Started by | Robert Davis <rdavis7408@gmail.com> |
|---|---|
| First post | 2015-07-22 15:54 -0700 |
| Last post | 2015-07-24 06:17 -0700 |
| Articles | 6 — 3 participants |
Find Minimum for element in multiple dimensional array Robert Davis <rdavis7408@gmail.com> - 2015-07-22 15:54 -0700
Re: Find Minimum for element in multiple dimensional array Emile van Sebille <emile@fenx.com> - 2015-07-22 16:26 -0700
Re: Find Minimum for element in multiple dimensional array Robert Davis <rdavis7408@gmail.com> - 2015-07-23 05:31 -0700
Re: Find Minimum for element in multiple dimensional array Robert Davis <rdavis7408@gmail.com> - 2015-07-23 14:50 -0700
Re: Find Minimum for element in multiple dimensional array Denis McMahon <denismfmcmahon@gmail.com> - 2015-07-24 00:18 +0000
Re: Find Minimum for element in multiple dimensional array Robert Davis <rdavis7408@gmail.com> - 2015-07-24 06:17 -0700
| From | Robert Davis <rdavis7408@gmail.com> |
|---|---|
| Date | 2015-07-22 15:54 -0700 |
| Subject | Find Minimum for element in multiple dimensional array |
| Message-ID | <a7fe6fd3-2a51-40a9-b995-33fc88cb27ee@googlegroups.com> |
Given a set of arrays within an array, how do I find the arrays with the minimum values based on two elements/columns in the array? Those two elements/columns are the destination zip code and distance.
I have an array of arrays, each holding an origin zip code, origin latitude, origin longitude, destination zip code, destination latitude, destination longitude, and miles between the two points.
I need to keep only those combinations that represent the minimum mileage to each destination zip code. For example, a point in New Jersey may be 45 miles from the Philadelphia Office, 78 miles from the Newark Office, and 58 miles from the Delaware Office.
I need to keep the row for the Philadelphia Office, at 45 miles, and produce a .csv file that has origin zip code, origin latitude, origin longitude, destination zip code, destination latitude, destination longitude, and miles between the two points.
The array looks like this:
[['37015', 'TN31', 36.2777, -87.0046, 'NY', 'White Plains', '10629', 41.119008, -73.732996, 77.338920003],
['72202', 'ARB1', 34.739224, -92.27765, 'NY', 'White Plains', '10629', 41.119008, -73.732996, 1099.7837975322097]]
My code looks like this :
import csv
import math


def calculate_distance(lat1, lon1, lat2, lon2):

    if (not lat1) or (not lon1) or (not lat2) or (not lon2):
        return -1

    lat1 = float(lat1) * math.pi/180
    lon1 = float(lon1) * math.pi/180
    lat2 = float(lat2) * math.pi/180
    lon2 = float(lon2) * math.pi/180

    return 3959.0 * math.acos(math.sin(lat1) * math.sin(lat2) + math.cos(lat1) * math.cos(lat2) * math.cos(lon2-lon1))

#Above function changed from the following URL: http://iamtgc.com/geocoding-with-python/


InputPath = "C:\\Users\\jacobs\\Downloads\\ZipCodes\\"

ZipCodes = "zipcode.csv"
RptgOfficeFile = "Reporting_Office_2015072001.csv"
InputFile = InputPath+RptgOfficeFile
zInputFile = InputPath+ZipCodes
zOutputFile = InputPath+'Zip_Code_Distance.csv'
z1OutputFile = InputPath+'Minimum_Distance_Zip_Code_File.csv'


f = open(InputFile, 'r')

zO = open(zOutputFile,'w')
z1 = open(z1OutputFile,'w')

lines = [ ]
OfficeZipcodes = []
ZipRptOffice = {}
OLatitude = [ ]
OLongitude = [ ]
OLocationCode = []
dzip = []
dLatitude = []
dLongitude = []
dCity = []
dState = []
Combined =[]
Answers = []

for line in f:
    l = [i.strip() for i in line.split(',')]
    OfficeZipcodes.append(l[4])
    ZipRptOffice[l[4]]= l[3]
    OLatitude.append(l[5])
    OLongitude.append(l[6])
    OLocationCode.append(l[3])

del OfficeZipcodes[0]
del OLatitude[0]
del OLongitude[0]
del OLocationCode[0]


zf = csv.DictReader(open(zInputFile))
#http://courses.cs.washington.edu/courses/cse140/13wi/csv-parsing.html

for row in zf:
    dzip.append(row["zip"])
    dLatitude.append(float(row["latitude"]))
    dLongitude.append(float(row["longitude"]))
    dCity.append(row["city"])
    dState.append(row["state"])


for i in range(len(OfficeZipcodes)):
    for j in range(len(dzip)):
        Distance = calculate_distance(OLatitude[i], OLongitude[i],dLatitude[j],dLongitude[j])
        Combined.append([OfficeZipcodes[i], OLocationCode[i],float(OLatitude[i]),float(OLongitude[i]),dState[j],dCity[j],dzip[j], dLatitude[j],dLongitude[j],Distance])
for i in range(len(Combined)):
    zO.write(str(Combined[i][0])+","+str(Combined[i][1])+","+str(Combined[i][2])+","+ str(Combined[i][3])+","+str(Combined[i][4])+","+ str(Combined[i][5])+","+ str(Combined[i][6])+","+str(Combined[i][7])+","+ str(Combined[i][8])+","+str(Combined[i][9])+"\n")

zO.close()
f.close()
I am using Python 2.7 on a Windows 7 machine.
Please help me get my head around how to accomplish this task.
Thank you very much.
Robert Davis
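One detail of the distance formula in the post may be worth guarding against: math.acos raises ValueError when its argument falls outside [-1, 1], and floating-point rounding can push the spherical law of cosines expression slightly past 1 when the two points coincide (for example, an office located in the destination zip code). The sketch below keeps the same formula and the same 3959.0 radius and only adds a clamp. The name great_circle_miles is hypothetical and not part of the original code.

```python
import math

EARTH_RADIUS_MILES = 3959.0  # same constant as in the original post


def great_circle_miles(lat1, lon1, lat2, lon2):
    """Spherical law of cosines distance in miles; inputs are decimal degrees."""
    lat1, lon1, lat2, lon2 = [math.radians(float(v))
                              for v in (lat1, lon1, lat2, lon2)]
    cos_angle = (math.sin(lat1) * math.sin(lat2)
                 + math.cos(lat1) * math.cos(lat2) * math.cos(lon2 - lon1))
    # Rounding error can leave cos_angle a hair outside [-1, 1],
    # which would make math.acos raise ValueError, so clamp it first.
    cos_angle = max(-1.0, min(1.0, cos_angle))
    return EARTH_RADIUS_MILES * math.acos(cos_angle)


# Same point twice: the result should be (very close to) zero, never an error.
same_point = great_circle_miles(41.119008, -73.732996, 41.119008, -73.732996)
```

The clamp changes nothing for ordinary inputs; it only affects values that are already out of range because of rounding.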
| From | Emile van Sebille <emile@fenx.com> |
|---|---|
| Date | 2015-07-22 16:26 -0700 |
| Message-ID | <mailman.891.1437607638.3674.python-list@python.org> |
| In reply to | #94407 |
On 7/22/2015 3:54 PM, Robert Davis wrote:
> I have an array of arrays, each holding an origin zip code, origin latitude, origin longitude, destination zip code, destination latitude, destination longitude, and miles between the two points.
>
> I need to keep only those combinations that represent the minimum mileage to each destination zip code. For example, a point in New Jersey may be 45 miles from the Philadelphia Office, 78 miles from the Newark Office, and 58 miles from the Delaware Office.
>
> I need to keep the row for the Philadelphia Office, at 45 miles, and produce a .csv file that has origin zip code, origin latitude, origin longitude, destination zip code, destination latitude, destination longitude, and miles between the two points.
>
> The array looks like this:
>
> [['37015', 'TN31', 36.2777, -87.0046, 'NY', 'White Plains', '10629', 41.119008, -73.732996, 77.338920003],
> ['72202', 'ARB1', 34.739224, -92.27765, 'NY', 'White Plains', '10629', 41.119008, -73.732996, 1099.7837975322097]]
Assume the array is in A:
---8<---8<---8<---8<---8<---8<---8<---8<---8<---8<---
A= [['37015', 'TN31', 36.2777, -87.0046, 'NY', 'White Plains', '10629',
     41.119008, -73.732996, 77.338920003],
    ['72202', 'ARB1', 34.739224, -92.27765, 'NY', 'White Plains', '10629',
     41.119008, -73.732996, 1099.7837975322097]]
# transform to a dict ignoring dups; the key is
# (destination zip, distance, origin zip)
D = dict( [ ( (r[6],r[-1],r[0]), r) for r in A ] )
# convert to a sorted list, so that the shortest distance comes first
# within each destination zip
L = sorted(D.items())
# then print the first (nearest) entry for each destination zip
# and filter out the rest
lastzip = None
for ky,rec in L:
    if ky[0] == lastzip:
        continue
    print ky,":",rec
    lastzip=ky[0]
---8<---8<---8<---8<---8<---8<---8<---8<---8<---8<---
The results are what you'd write to the csv file.
Tested only with the data you provided.
HTH,
Emile
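The sort-then-skip idea in this reply can also be sketched with itertools.groupby: sort on destination zip and distance, then take the first row of each destination group. This is only an illustration under the assumption that the destination zip is column 6 and the mileage is column 9, as in the sample rows; the function name nearest_per_destination is made up for the example.

```python
from itertools import groupby
from operator import itemgetter

# The two sample rows from the original question.
rows = [
    ['37015', 'TN31', 36.2777, -87.0046, 'NY', 'White Plains', '10629',
     41.119008, -73.732996, 77.338920003],
    ['72202', 'ARB1', 34.739224, -92.27765, 'NY', 'White Plains', '10629',
     41.119008, -73.732996, 1099.7837975322097],
]

DEST_ZIP, MILES = 6, 9  # assumed column positions, read off the sample rows


def nearest_per_destination(rows):
    """Return the row with the smallest mileage for each destination zip."""
    # Sorting by (zip, miles) puts the nearest office first in each zip group.
    ordered = sorted(rows, key=itemgetter(DEST_ZIP, MILES))
    # groupby needs sorted input; next(group) takes the first row per group.
    return [next(group)
            for _, group in groupby(ordered, key=itemgetter(DEST_ZIP))]


nearest = nearest_per_destination(rows)
```

With the two sample rows, both share destination '10629', so only the TN31 row should remain. Sorting costs O(n log n); a single dictionary pass is an alternative when the list is large.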
| From | Robert Davis <rdavis7408@gmail.com> |
|---|---|
| Date | 2015-07-23 05:31 -0700 |
| Message-ID | <01420576-dbe1-4645-9d76-a6b56ab99999@googlegroups.com> |
| In reply to | #94407 |
Emile,
Thank you, I will give it a try and see if I can get it to work.
I really do appreciate your effort.
Robert
| From | Robert Davis <rdavis7408@gmail.com> |
|---|---|
| Date | 2015-07-23 14:50 -0700 |
| Message-ID | <462065e7-88b6-4324-86e6-8f15a62c5c7e@googlegroups.com> |
| In reply to | #94407 |
Emile,
Thanks so much, it works wonderfully. It is certainly a brilliant way of seeing the resolution.
Robert
| From | Denis McMahon <denismfmcmahon@gmail.com> |
|---|---|
| Date | 2015-07-24 00:18 +0000 |
| Message-ID | <mos086$fpa$1@dont-email.me> |
| In reply to | #94407 |
On Wed, 22 Jul 2015 15:54:06 -0700, Robert Davis wrote:
> Given a set of arrays within an array how do I find the arrays with the
> minimum values based on two elements/columns in the array? Those two
> elements/columns are the destination zip code and distance.
create a new dictionary

for each source/destination pair in your list of source/destination pairs:

    if the destination zip code is not in the new dictionary, copy the entry
    to the new dictionary keyed on the destination zip code.

    if the destination zip code is in the new dictionary, copy the entry to
    the new dictionary keyed on the destination zip code only if the distance
    is less than the distance of the current entry in the new dictionary.

convert the values of the new dictionary to a list.

write the list as csv
Here is an example. Note that I'm just using 2 bits of data: the first
bit of data in each sub list simulates the destination zip code, and the
second simulates the distance from the associated source zip code.
Obviously you need to adjust these to match the actual parameters in your
list of lists.
import csv

info = [['a',15],['a',17],['a',21],['b',96],['b',45],['b',38],['c',71],
        ['c',18],['c',54]]

tmp = {}

for thing in info:
    if thing[0] in tmp:
        if thing[1] < tmp[thing[0]][1]:
            tmp[thing[0]] = thing
    else:
        tmp[thing[0]] = thing

with open("output.csv", "wb") as f:
    writer = csv.writer(f)
    writer.writerows(tmp.values())
and lo:
$ cat output.csv
a,15
c,18
b,38
$
--
Denis McMahon, denismfmcmahon@gmail.com
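As a rough illustration of adjusting the dictionary approach to the ten-column rows from the original question, the sketch below assumes the destination zip is column 6 and the mileage is column 9. It is written for Python 3 and writes to an in-memory buffer so it runs without touching the disk; under Python 2.7, as used in this thread, the file would be opened with "wb" as in the example above. The function name keep_nearest and the third data row are hypothetical and exist only for the demonstration.

```python
import csv
import io

DEST_ZIP, MILES = 6, 9  # assumed column positions, read off the sample rows


def keep_nearest(rows):
    """Return one row per destination zip: the one with the smallest mileage."""
    best = {}
    for row in rows:
        current = best.get(row[DEST_ZIP])
        if current is None or row[MILES] < current[MILES]:
            best[row[DEST_ZIP]] = row
    # Sort by destination zip so the output order is predictable.
    return [best[z] for z in sorted(best)]


rows = [
    ['37015', 'TN31', 36.2777, -87.0046, 'NY', 'White Plains', '10629',
     41.119008, -73.732996, 77.338920003],
    ['72202', 'ARB1', 34.739224, -92.27765, 'NY', 'White Plains', '10629',
     41.119008, -73.732996, 1099.7837975322097],
    # Hypothetical row, added only so there is a second destination zip.
    ['72202', 'ARB1', 34.739224, -92.27765, 'AR', 'Little Rock', '72202',
     34.739224, -92.27765, 0.0],
]

buffer = io.StringIO()
csv.writer(buffer, lineterminator="\n").writerows(keep_nearest(rows))
output = buffer.getvalue()
```

Letting csv.writer build each line also takes care of quoting, which matters if a city name ever contains a comma.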
| From | Robert Davis <rdavis7408@gmail.com> |
|---|---|
| Date | 2015-07-24 06:17 -0700 |
| Message-ID | <02bb73b5-7a8a-43fa-ab5d-2325464f3d81@googlegroups.com> |
| In reply to | #94407 |
Thank you Denis.