Eigen::internal::generic_reciprocal_newton_step< Packet, Steps > Struct Template Reference

#include <MathFunctionsImpl.h>

Static Public Member Functions

static EIGEN_DEVICE_FUNC EIGEN_STRONG_INLINE Packet run (const Packet &a, const Packet &approx_a_recip)
 

Detailed Description

template<typename Packet, int Steps>
struct Eigen::internal::generic_reciprocal_newton_step< Packet, Steps >

Fast reciprocal using Newton-Raphson's method.

Preconditions:

  1. The starting guess provided in approx_a_recip must agree with the correct result in at least half of the leading mantissa bits, so that a single Newton-Raphson step is sufficient to get within 1-2 ulps of the correct result.
  2. If a is zero, approx_a_recip must be infinite with the same sign as a.
  3. If a is infinite, approx_a_recip must be zero with the same sign as a.

If the preconditions are satisfied, which they are for the _*_rcp_ps instructions on x86, the result has a maximum relative error of 2 ulps, and reciprocals of zero, infinity, and NaN are handled correctly.
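The requirement of "half the leading mantissa bits" follows from how the relative error behaves under one refinement step. The short derivation below is a sketch that assumes exact arithmetic and introduces the symbol \varepsilon (not in the original text) for the relative error of the starting guess.

```latex
% Assume x_{i-1} = (1 - \varepsilon)/a, so that a x_{i-1} = 1 - \varepsilon.
\begin{align*}
x_i &= x_{i-1}\,(2 - a\,x_{i-1}) \\
    &= \frac{1 - \varepsilon}{a}\,\bigl(2 - (1 - \varepsilon)\bigr) \\
    &= \frac{(1 - \varepsilon)(1 + \varepsilon)}{a}
     = \frac{1 - \varepsilon^{2}}{a}.
\end{align*}
% The relative error is squared by each step: for a 24-bit float significand,
% |\varepsilon| \le 2^{-12} gives \varepsilon^{2} \le 2^{-24}.
```

Under that assumption the number of correct bits roughly doubles per step, and what remains after one step is dominated by floating-point rounding, which is consistent with the stated bound of a few ulps.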

Member Function Documentation

◆ run()

template<typename Packet, int Steps>
static EIGEN_DEVICE_FUNC EIGEN_STRONG_INLINE Packet Eigen::internal::generic_reciprocal_newton_step< Packet, Steps >::run (const Packet &a, const Packet &approx_a_recip)
inline static
{
  using Scalar = typename unpacket_traits<Packet>::type;
  const Packet two = pset1<Packet>(Scalar(2));
  // Refine the approximation using one Newton-Raphson step:
  //   x_{i} = x_{i-1} * (2 - a * x_{i-1})
  const Packet x = generic_reciprocal_newton_step<Packet, Steps - 1>::run(a, approx_a_recip);
  const Packet tmp = pnmadd(a, x, two);
  // If tmp is NaN, it means that a is either +/-0 or +/-Inf.
  // In this case return the approximation directly.
  const Packet is_not_nan = pcmp_eq(tmp, tmp);
  return pselect(is_not_nan, pmul(x, tmp), x);
}
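The refinement step and its NaN guard can be illustrated with a scalar analogue that needs only the standard library. This is a minimal sketch, not Eigen code: scalar_reciprocal_newton_step, crude_reciprocal, and reciprocal_demo are hypothetical names, std::fma stands in for pnmadd, and the conditional expression stands in for pcmp_eq plus pselect. The seed is assumed to carry a relative error of about 2^-12, chosen to satisfy precondition 1 for float.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Hypothetical scalar stand-in for the packet version: applies Steps
// refinements of x_i = x_{i-1} * (2 - a * x_{i-1}) by recursion on Steps.
template <int Steps>
struct scalar_reciprocal_newton_step {
  static float run(float a, float approx_a_recip) {
    const float x = scalar_reciprocal_newton_step<Steps - 1>::run(a, approx_a_recip);
    // fma(-a, x, 2) evaluates 2 - a * x with a single rounding.
    const float tmp = std::fma(-a, x, 2.0f);
    // 0 * inf is NaN, so tmp is NaN when a is +/-0 or +/-Inf and the guess
    // satisfies preconditions 2 and 3; keep the guess in that case.
    return std::isnan(tmp) ? x : x * tmp;
  }
};

// Base case: zero steps returns the starting guess unchanged.
template <>
struct scalar_reciprocal_newton_step<0> {
  static float run(float, float approx_a_recip) { return approx_a_recip; }
};

// Hypothetical low-precision seed for finite, nonzero a: the reciprocal
// perturbed by a relative error of 2^-12 (about 12 correct bits).
inline float crude_reciprocal(float a) {
  return (1.0f / a) * (1.0f + 1.0f / 4096.0f);
}

inline void reciprocal_demo() {
  // One step takes the ~2^-12 seed error down to float rounding level.
  const float refined = scalar_reciprocal_newton_step<1>::run(3.0f, crude_reciprocal(3.0f));
  assert(std::fabs(double(refined) - 1.0 / 3.0) < 1e-6 / 3.0);

  // a == 0 with an infinite guess: tmp is NaN, so the guess is returned.
  const float inf = std::numeric_limits<float>::infinity();
  assert(std::isinf(scalar_reciprocal_newton_step<1>::run(0.0f, inf)));
}
```

Without the guard, the zero and infinity cases would return NaN, because x * tmp propagates the NaN from 2 - a * x. Returning x instead is what lets the seed's infinity or signed zero pass through unchanged.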

References a, Eigen::internal::pcmp_eq(), Eigen::internal::pmul(), Eigen::internal::pnmadd(), Eigen::internal::pselect(), tmp, and plotDoE::x.


The documentation for this struct was generated from the following file: