| From | Julian Gilbey <jdg@debian.org> |
|---|---|
| Newsgroups | linux.debian.maint.python, linux.debian.devel, linux.debian.bugs.dist |
| Subject | Seeking a small group to package Apache Arrow (was: Bug#970021: RFP: apache-arrow -- cross-language development platform for in-memory analytics) |
| Date | 2024-03-25 19:20 +0100 |
| Message-ID | <IlX5n-1Dj0-1@gated-at.bofh.it> |
| Organization | linux.* mail to news gateway |
Hi all,
[NB: sent to d-science, d-python, d-devel and the RFP bug; reply-to
set to d-science and the RFP bug only]
An update on Apache Arrow, and in particular the Python library
PyArrow. For those who don't know:
Apache Arrow is a development platform for in-memory analytics. It
contains a set of technologies that enable big data systems to
process and move data fast. It specifies a standardized
language-independent columnar memory format for flat and
hierarchical data, organized for efficient analytic operations on
modern hardware.
The project is developing a multi-language collection of libraries
for solving systems problems related to in-memory analytical data
processing. This includes such topics as:
* Zero-copy shared memory and RPC-based data movement
* Reading and writing file formats (like CSV, Apache ORC, and Apache
Parquet)
* In-memory analytics and query processing
(from: https://arrow.apache.org/docs/index.html)
Pandas has announced that Pandas 3.x will depend on PyArrow
in a critical way (it will back the "string" datatype), and Pandas 3.x
is due to be released imminently.
So this is a plea for anyone looking for something really helpful to
do: it would be great to have a group of developers finally package
this! There was some initial work done (see the RFP bug report for
details: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=970021),
but that is fairly old now. As Apache Arrow supports numerous
languages, it may well benefit from having a group of developers with
different areas of expertise to build it. (Or perhaps it would make
more sense to split the upstream source into a collection of different
Debian source packages for the different supported languages. I don't
know.) Unfortunately I don't have the capacity to devote any time to
it myself.
Thanks in advance to anyone who can step forward for this!
Best wishes,
Julian