better read/write interface? #22

Open
stanleyjs opened this issue Feb 11, 2024 · 0 comments

Hello,

We are running Sanity in a pipeline that's primarily implemented in Python. Our datasets can be quite large, and our performance is being crippled by Sanity's I/O interface. As I understand it, Sanity expects a Matrix Market format file as input and writes a CSV as output.

Our data is already held in memory by a Python parent process, and we launch Sanity as a subprocess.
Is there a faster way to send data to and receive data from Sanity? Right now we are stuck waiting on Sanity to write an enormous CSV file, which we then have to read back into memory in the parent process.
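
To make the cost concrete, here is a minimal sketch of the kind of round trip described above, using only the standard library. The function names, file names, and `demo_command` are all hypothetical. In particular, `demo_command` is a stand-in one-liner that writes a fixed CSV, not Sanity's real command line, which is not shown here.

```python
import csv
import os
import subprocess
import sys
import tempfile


def write_matrix_market(path, rows):
    """Write a dense list-of-lists as a Matrix Market coordinate file."""
    n_rows = len(rows)
    n_cols = len(rows[0]) if rows else 0
    # Matrix Market coordinate format stores nonzeros with 1-based indices.
    entries = [
        (i + 1, j + 1, v)
        for i, row in enumerate(rows)
        for j, v in enumerate(row)
        if v != 0
    ]
    with open(path, "w") as fh:
        fh.write("%%MatrixMarket matrix coordinate real general\n")
        fh.write(f"{n_rows} {n_cols} {len(entries)}\n")
        for i, j, v in entries:
            fh.write(f"{i} {j} {v}\n")


def read_csv_matrix(path):
    """Parse the whole CSV back into Python floats (the second full copy)."""
    with open(path, newline="") as fh:
        return [[float(x) for x in row] for row in csv.reader(fh) if row]


def demo_command(mtx_path, csv_path):
    """Hypothetical stand-in for the external tool: writes a fixed 2x2 CSV."""
    script = (
        "import sys, pathlib; "
        "pathlib.Path(sys.argv[2]).write_text('1.0,2.0\\n3.0,4.0\\n')"
    )
    return [sys.executable, "-c", script, mtx_path, csv_path]


def round_trip(rows, make_command):
    """Serialize to disk, run the external program, then parse its CSV."""
    with tempfile.TemporaryDirectory() as tmp:
        mtx_path = os.path.join(tmp, "counts.mtx")
        csv_path = os.path.join(tmp, "output.csv")
        write_matrix_market(mtx_path, rows)          # full write of the input
        subprocess.run(make_command(mtx_path, csv_path), check=True)
        return read_csv_matrix(csv_path)             # full re-read of the output
```

Calling `round_trip([[1, 0], [0, 2]], demo_command)` exercises the whole path. Every value crosses the filesystem as text twice, once in each direction, which is the overhead in question.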

The most obvious solution to me is to write a Python-to-C interface for Sanity using ctypes (`CDLL`). I wonder if you guys have any plans for this, or any tips to speed up interfacing with Sanity without hitting the disk so much?
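
As a rough illustration of the ctypes pattern I have in mind, the sketch below passes an in-memory buffer of doubles to a C function and gets the result back in a second buffer, with no files involved. Sanity does not, as far as I know, expose a shared library, so libc's `memcpy` is used here purely as a stand-in for a hypothetical entry point. `call_in_memory` is an invented name, and `ctypes.CDLL(None)` assumes a POSIX system.

```python
import ctypes

# POSIX assumption: CDLL(None) gives access to symbols already loaded into
# this process, including the C library. This does not work on Windows.
libc = ctypes.CDLL(None)

# Declare the C signature so ctypes converts arguments correctly:
#   void *memcpy(void *dest, const void *src, size_t n);
libc.memcpy.argtypes = [ctypes.c_void_p, ctypes.c_void_p, ctypes.c_size_t]
libc.memcpy.restype = ctypes.c_void_p


def call_in_memory(values):
    """Hand a buffer of doubles to C and read the result from another buffer.

    memcpy is only a stand-in: a real binding would call an exported
    function of the tool's own shared library instead.
    """
    DoubleArray = ctypes.c_double * len(values)
    src = DoubleArray(*values)   # input buffer owned by Python
    dst = DoubleArray()          # zero-initialized output buffer the C side fills
    libc.memcpy(dst, src, ctypes.sizeof(src))
    return list(dst)
```

With a real binding, the input would never be written out as Matrix Market text and the output would never be parsed from CSV. Both sides would simply share the buffers.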
