Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enable macOS M1 wheel builds #166

Open
wants to merge 3 commits into
base: master
Choose a base branch
from
Open

Conversation

nicholasjng
Copy link

This commit contains fixes for ml-metadata's build setup to enable wheel builds on macOS M1.

The core build files have not been touched except for version increments of Bazel dependencies, and for mariadbconnector-c, whose build file needed to be updated to support the latest version.

Several dependencies were upgraded to newer versions, with the update policy being "as recent as possible without breaking changes". A few gRPC dependencies were eliminated from ml-metadata's workspace to avoid friction on subsequent gRPC updates (this includes e.g. protobuf, zlib).

This change allows for M1 wheel builds without more external system dependencies outside of Bazel - the used build definitions contain M1 capabilities for sufficiently new versions of Bazel.

@nicholasjng
Copy link
Author

nicholasjng commented Oct 31, 2022

TODO:

  • Some comments are wrong/out of date (e.g. in the WORKSPACE), fix them up
  • Guard the -D_XOPEN_SOURCE macro behind a Bazel select to not disturb other platforms

@nicholasjng
Copy link
Author

Rebased on current master and fixed the above todos.

@BrianSong
Copy link
Collaborator

Thanks, @nicholasjng! I am currently testing this PR with our internal tests around OSS code, would come back to you soon.

@pivonroll
Copy link

@BrianSong Any news on test front? Did PR pass your OSS tests?

@BrianSong
Copy link
Collaborator

Unfortunately, this is blocked by the internal test integration service does not support M1 Mac yet - which is still at design stage at this point. This is not likely to be finished within Q1.

@BrianSong
Copy link
Collaborator

FYI, I have transferred to another team. Please reach out to @ryanpeters-google for further questions as he is the owner for MLMD OSS now.

@XinranTang
Copy link
Collaborator

@nicholasjng Due to recent MLMD updates, it seems some conflicts should be resolved before we can merge this PR. Would you like to take a look at it?

@nicholasjng
Copy link
Author

Sure thing @XinranTang , would you mind and get me up to speed on the changes? That would help me be quicker on this, don't have much free time at the moment

@XinranTang
Copy link
Collaborator

@nicholasjng May I kindly ask what's the extent of urgency / priority of checking in this PR?

@nicholasjng
Copy link
Author

It has been open for close to 9 months now, so it is not day-to-day urgent. But it does enable Mac ARM users, which I guess is something that the project owners would appreciate? I can have a look when I get time.

@codesue
Copy link

codesue commented Jul 25, 2023

Hi @XinranTang, thank you for taking another look at this PR and continuing this conversation! Issue #143 and other long-standing issues like tensorflow/tfx-bsl#48, tensorflow/transform#297, tensorflow/serving#1816, tensorflow/data-validation#205, and tensorflow/model-card-toolkit#264 are major blockers for developers with Apple Silicon machines who want to use libraries across the TensorFlow ecosystem. There's more context in this TensorFlow Forum discussion. It would be really great to see this prioritized by the MLMD team because there aren't any good workarounds.

@XinranTang
Copy link
Collaborator

Hi @codesue Thank you for providing useful artifacts for us to estimate the priority of continuing resolving this PR. Let me discuss with the team and keep you updated.

@nicholasjng
Copy link
Author

In my opinion, remaining on protobuf@3 for Python will probably not work due to their M1 wheel problems in the past. AFAIK, fixed it in 4, but it would be helpful if you could confirm that bumping protobuf's major version is fine (I know you guys keep dependencies in lockstep across the TF ecosystem).

Some context: protocolbuffers/protobuf#8820 (comment)

This commit contains fixes for `ml-metadata`'s build setup to enable wheel builds on macOS M1.

The core build files have not been touched except for version increments of Bazel dependencies, and for `mariadbconnector-c`, whose build file needed to be updated to support
the latest version.

Several dependencies were upgraded to newer versions, with the update policy being "as recent as possible without breaking changes". A few gRPC dependencies were eliminated from
`ml-metadata`'s workspace to avoid friction on subsequent gRPC updates (this includes e.g. `protobuf`, `zlib`).

This change allows for M1 wheel builds without more external system dependencies outside of Bazel - the used build definitions contain M1 capabilities for sufficiently new versions of Bazel.
@nicholasjng
Copy link
Author

@tangm I think @thesuperzapper's solution to making the postgres genrule portable from #188 is much better, since it shortens the diff dramatically (guarding __get_cpuid behind an x86 preprocessor def instead of a Bazel select), and it also works on Linux ARM64.

Let's talk about how to merge these two - I still think the dependency changes are necessary, but we should try and limit them as much as possible.

@tangm
Copy link

tangm commented Dec 13, 2023

@tangm I think @thesuperzapper's solution to making the postgres genrule portable from #188 is much better, since it shortens the diff dramatically (guarding __get_cpuid behind an x86 preprocessor def instead of a Bazel select), and it also works on Linux ARM64.

I think using a pre-established approach that the maintainers are comfortable with (taking from tensorflow/io and modifying that) makes sense. @thesuperzapper updating the existing "port" from tfio seems to be the way to go. I do think his changes are somewhat more involved than getting m1 wheel builds going, so maybe just using the same conditional approach without the other changes might help limit scope. The windows references and single condition selects should probably still go though...

On a separate note, I'm a little uncomfortable with the general approach in the postgres.build, but not familiar enough with bazel so have a good alternative:

  • Using #define and select in cc_library/defines makes it less consistent/clear when to choose what since there are two possibilities.
  • The genrule seems to be trying to do the work of what configure does in a configure/make build approach. I'm sure there are reasons for this, but this seems to be a source for the errors we're trying to get around.

@ConfuzedCoder
Copy link

Hi @XinranTang

Are there any updates on this issue? Additionally, do you have an estimated timeline for when it might be resolved?

I am a Silicon user and I am not able to use TFX for the last 3 years.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants