[Mosaic:TPU] Efficient relayout with internal scratch
As of this CL, all retilings (x*packing1, 128) <-> (y*packing2, 128) should be supported for any dtype. The efficient relayout with internal scratch significantly improves the existing retiling on TPUv4 and earlier, as well as retiling with (packing, 128) on TPUv5. This CL also adds all previously missing retiling support, including sublane-increasing retiling and packed-type retiling.

PiperOrigin-RevId: 676982957
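
The tile shapes named in the commit message can be sketched as follows. This is only an illustration, not code from the CL: it assumes that "packing" means the number of elements that fit in one 32-bit word (32 / bitwidth), and `packingFor` and `tileFor` are hypothetical helper names.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>

// Assumption: packing is the number of elements stored per 32-bit word,
// e.g. 1 for 32-bit, 2 for 16-bit and 4 for 8-bit element types.
inline int64_t packingFor(int64_t bitwidth) {
  assert(bitwidth > 0 && 32 % bitwidth == 0);
  return 32 / bitwidth;
}

// Hypothetical helper: the tile shape (x * packing, 128) for a given
// sublane multiplier x and element bitwidth.
inline std::pair<int64_t, int64_t> tileFor(int64_t x, int64_t bitwidth) {
  return {x * packingFor(bitwidth), 128};
}
```

Under these assumptions, a 16-bit type with x = 8 gives the tile (16, 128) and an 8-bit type with x = 1 gives (4, 128); a retiling in the sense of the message moves data between two such tiles of the same dtype.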
bythew3i authored and Google-ML-Automation committed Sep 20, 2024
1 parent a533635 commit 6b93b35
Showing 4 changed files with 439 additions and 135 deletions.
1 change: 1 addition & 0 deletions jaxlib/mosaic/dialect/tpu/tpu.td
@@ -790,6 +790,7 @@ def ApplyVectorLayoutPass : Pass<"tpu-apply-vector-layout", "::mlir::func::FuncO
Option<"mxu_contracting_size", "mxu-contracting-size", "int", /*default=*/"128", "">,
Option<"mxu_noncontracting_size", "mxu-noncontracting-size", "int", /*default=*/"128", "">,
Option<"max_sublanes_in_scratch", "max-sublanes-in-scratch", "int", /*default=*/"0", "">,
Option<"vmem_banks", "vmem-banks", "int", /*default=*/"-1", "">,
];
}

1 change: 1 addition & 0 deletions jaxlib/mosaic/dialect/tpu/tpu_dialect.h
@@ -62,6 +62,7 @@ struct ApplyVectorLayoutContext {
// mxu_shape = {contracting_size, non_contracting_size}
std::array<int64_t, 2> mxu_shape = {128, 128};
int64_t max_sublanes_in_scratch = 0;
int64_t vmem_banks = -1; // -1 means "unspecified".
};

std::pair<bool, bool> mightCommunicateBetweenChips(Operation* op);
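
The `-1 means "unspecified"` convention for `vmem_banks` could be consumed along these lines. This is a minimal sketch, not the code in this commit: `LayoutContextSketch` is a hypothetical stand-in that copies only two fields from the hunk above, and `knownVmemBanks` is a hypothetical helper.

```cpp
#include <cstdint>
#include <optional>

// Hypothetical stand-in for the context struct; only the fields needed
// for the illustration are reproduced, with the defaults from the diff.
struct LayoutContextSketch {
  int64_t max_sublanes_in_scratch = 0;
  int64_t vmem_banks = -1;  // -1 means "unspecified".
};

// Hypothetical helper: turn the -1 sentinel into an empty optional so
// callers cannot mistake "unspecified" for a real bank count.
inline std::optional<int64_t> knownVmemBanks(const LayoutContextSketch& ctx) {
  if (ctx.vmem_banks == -1) return std::nullopt;
  return ctx.vmem_banks;
}
```

Mapping the sentinel to `std::optional` at the point of use keeps the pass option itself a plain `int`, which matches how the option is declared in `tpu.td`.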