GeneralScalarContraction is a template class that provides the scalar value of a Tensor-Tensor contraction operation when all dimensions are contracting dimensions. This kernel reduces two tensors to a scalar.
template<typename OutScalar, typename LhsScalar, typename RhsScalar, typename OutAccessor, typename LhsMapper, typename RhsMapper, typename StorageIndex, bool Vectorizable>
struct Eigen::TensorSycl::internal::GeneralScalarContraction< OutScalar, LhsScalar, RhsScalar, OutAccessor, LhsMapper, RhsMapper, StorageIndex, Vectorizable >
- Template Parameters
| OutScalar | determines the output scalar type |
| LhsScalar | determines the left-hand-side scalar type |
| RhsScalar | determines the right-hand-side scalar type |
| OutAccessor | determines the SYCL accessor type for the output (see the SYCL 1.2.1 specification (https://www.khronos.org/registry/SYCL/specs/sycl-1.2.1.pdf) for the accessor definition) |
| LhsMapper | determines the tensor contraction mapper type for the left-hand-side matrix |
| RhsMapper | determines the tensor contraction mapper type for the right-hand-side matrix |
| StorageIndex | determines the integer type used for indexing (work-item IDs, loop counters, and offsets) |
| Vectorizable | determines whether vectorization is enabled for the Eigen expression |
- Parameters
| scratch | local memory used by each work-group as scratch space while reducing the partial results of its work-items |
| lhs | determines the left-hand-side flattened tensor (tensor mapper) |
| rhs | determines the right-hand-side flattened tensor (tensor mapper) |
| out_res | determines the output tensor containing the contraction result |
| rng | determines the total input data size |
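As a point of reference, when every dimension is contracted and both operands are viewed as flattened buffers of `rng` elements, the result is a single sum of element-wise products. The sketch below is a plain host-side C++ illustration under that assumption. `full_contraction_reference` is a hypothetical helper, not part of Eigen, and it ignores the mapper types and SYCL accessors entirely.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical host-side reference (not Eigen API): contracts two flattened
// tensors over all of their elements and returns the resulting scalar.
// Assumption: both buffers use the same element ordering.
template <typename OutScalar, typename LhsScalar, typename RhsScalar>
OutScalar full_contraction_reference(const std::vector<LhsScalar>& lhs,
                                     const std::vector<RhsScalar>& rhs) {
  // rng plays the role of the total input size; use the shorter buffer
  // so the loop never reads out of bounds.
  const std::size_t rng = lhs.size() < rhs.size() ? lhs.size() : rhs.size();
  OutScalar result = OutScalar(0);
  for (std::size_t i = 0; i < rng; ++i) {
    result += static_cast<OutScalar>(lhs[i]) * static_cast<OutScalar>(rhs[i]);
  }
  return result;
}
```

A serial loop like this can serve as a baseline when checking the output of a parallel reduction on small inputs.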
template<typename OutScalar , typename LhsScalar , typename RhsScalar , typename OutAccessor , typename LhsMapper , typename RhsMapper , typename StorageIndex , bool Vectorizable>
OutScalar *scratch_ptr = scratch.get_pointer();
StorageIndex globalid = itemID.get_global_id(0);
StorageIndex localid = itemID.get_local_id(0);
OutScalar accumulator = OutScalar(0);
// Grid-stride loop over the input range.
for (StorageIndex i = globalid; i < rng; i += itemID.get_global_range(0)) {
  // ... (loop body not shown in this listing)
}
// Store this work-item's value in local memory, then tree-reduce within the work-group.
auto out_scratch_ptr = scratch_ptr + localid;
*out_scratch_ptr = accumulator;
for (StorageIndex offset = itemID.get_local_range(0) >> 1; offset > 0; offset >>= 1) {
  itemID.barrier(cl::sycl::access::fence_space::local_space);
  if (localid < offset) {
    *out_scratch_ptr = (accumulator += out_scratch_ptr[offset]);
  }
}
// ...
// One value is written per work-group.
out_ptr[itemID.get_group(0)] = accumulator;
References i, Eigen::internal::pmadd(), and the following members of Eigen::TensorSycl::internal::GeneralScalarContraction< OutScalar, LhsScalar, RhsScalar, OutAccessor, LhsMapper, RhsMapper, StorageIndex, Vectorizable >: lhs, out_res, rhs, rng, and scratch.
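The control flow of the kernel body above can be emulated serially on the host to see how the per-work-group results arise. The following is a minimal sketch, not the library's implementation: `emulate_scalar_contraction`, `num_groups`, and `local_range` are hypothetical names, the operands are assumed to be plain flat buffers multiplied element by element, and the work-group size is assumed to be a power of two so that the halving loop covers every scratch slot.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical host-side emulation (not Eigen API) of a grid-stride
// accumulation followed by a per-work-group tree reduction.
// Assumption: local_range is a power of two.
template <typename Scalar>
std::vector<Scalar> emulate_scalar_contraction(const std::vector<Scalar>& lhs,
                                               const std::vector<Scalar>& rhs,
                                               std::size_t num_groups,
                                               std::size_t local_range) {
  const std::size_t rng = lhs.size() < rhs.size() ? lhs.size() : rhs.size();
  const std::size_t global_range = num_groups * local_range;
  std::vector<Scalar> partials(num_groups, Scalar(0));

  for (std::size_t group = 0; group < num_groups; ++group) {
    // Stands in for the work-group's local (scratch) memory.
    std::vector<Scalar> scratch(local_range, Scalar(0));

    // Phase 1: each emulated work-item strides over the input range.
    for (std::size_t localid = 0; localid < local_range; ++localid) {
      const std::size_t globalid = group * local_range + localid;
      Scalar accumulator = Scalar(0);
      for (std::size_t i = globalid; i < rng; i += global_range) {
        accumulator += lhs[i] * rhs[i];
      }
      scratch[localid] = accumulator;
    }

    // Phase 2: halve the active range each step. On a device, a barrier
    // separates the steps; the serial loop needs none because each step
    // reads only slots >= offset and writes only slots < offset.
    for (std::size_t offset = local_range >> 1; offset > 0; offset >>= 1) {
      for (std::size_t localid = 0; localid < offset; ++localid) {
        scratch[localid] += scratch[localid + offset];
      }
    }

    // One partial result per work-group.
    partials[group] = scratch[0];
  }
  return partials;
}
```

In this sketch the function returns one partial sum per emulated work-group, so a caller still has to add those partials together to obtain the final scalar.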