- Use the shape attribute to find the dimensions of the array (rows, columns).
m.shape # (2, 3)
- arange returns evenly spaced values within a given interval.
n = np.arange(0, 30, 2) # start at 0, count up by 2, stop before 30
array([ 0, 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28])
- reshape returns an array with the same data in a new shape.
n.reshape(3, 5) # reshape array to be 3x5
array([[ 0, 2, 4, 6, 8], [10, 12, 14, 16, 18], [20, 22, 24, 26, 28]])
- linspace returns evenly spaced numbers over a specified interval.
o = np.linspace(0, 4, 9) # 9 values from 0 to 4, both endpoints included
array([ 0. , 0.5, 1. , 1.5, 2. , 2.5, 3. , 3.5, 4. ])
- resize changes the shape and size of the array in place.
o.resize(3, 3)
array([[ 0. , 0.5, 1. ], [ 1.5, 2. , 2.5], [ 3. , 3.5, 4. ]])
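The four calls above can be put together in one runnable sketch. The names `n` and `o` follow the snippets above; `n35` is a name introduced here only for illustration. It is meant to show the difference between `reshape`, which returns a reshaped array and leaves `n` as it was, and `resize`, which changes `o` itself.

```python
import numpy as np

n = np.arange(0, 30, 2)     # 15 values: 0, 2, ..., 28
n35 = n.reshape(3, 5)       # reshaped result; n itself stays 1-D

o = np.linspace(0, 4, 9)    # 9 values from 0 to 4, endpoints included
o.resize(3, 3)              # in place: o itself is now 3x3

print(n.shape, n35.shape, o.shape)
```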
>>> p = np.ones([2, 3], int)
>>> p
array([[1, 1, 1],
       [1, 1, 1]])
- Use vstack to stack arrays in sequence vertically (row wise).
>>> np.vstack([p, 2*p])
array([[1, 1, 1],
       [1, 1, 1],
       [2, 2, 2],
       [2, 2, 2]])
- Use hstack to stack arrays in sequence horizontally (column wise).
>>> np.hstack([p, 2*p])
array([[1, 1, 1, 2, 2, 2],
       [1, 1, 1, 2, 2, 2]])
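As a quick check of the two stacking directions, the sketch below compares the result shapes and, for 2-D inputs like `p`, compares each call with `np.concatenate` along the matching axis. The names `v`, `h`, `same_v` and `same_h` are illustrative and not from the notes.

```python
import numpy as np

p = np.ones([2, 3], int)
v = np.vstack([p, 2 * p])   # rows are appended: shape (4, 3)
h = np.hstack([p, 2 * p])   # columns are appended: shape (2, 6)

# For 2-D inputs these give the same result as concatenating on axis 0 / axis 1.
same_v = np.array_equal(v, np.concatenate([p, 2 * p], axis=0))
same_h = np.array_equal(h, np.concatenate([p, 2 * p], axis=1))
print(v.shape, h.shape, same_v, same_h)
```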
- Dot Product:
x.dot(y)
- Use .T to get the transpose.
z.T
- Use .dtype to see the data type of the elements in the array.
z.dtype
- Use .astype to cast to a specific type.
z = z.astype('f') # 'f' is the type code for float32
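The notes do not show how `x`, `y` and `z` were defined, so the sketch below assumes small illustrative values just to make the four operations runnable; `d`, `zt` and `zf` are names made up here.

```python
import numpy as np

# Illustrative values; the notes do not define x, y or z.
x = np.array([1, 2, 3])
y = np.array([4, 5, 6])
d = x.dot(y)                  # 1*4 + 2*5 + 3*6

z = np.array([[1, 2, 3],
              [4, 5, 6]])
zt = z.T                      # transpose: shape (2, 3) becomes (3, 2)
zf = z.astype('f')            # cast to float32; z itself is unchanged
print(d, zt.shape, z.dtype, zf.dtype)
```

Note that `astype` returns a new array, which is why the notes reassign the result to `z`.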
- Let's create a new 4 by 3 array of random numbers 0-9.
test = np.random.randint(0, 10, (4,3))
test
array([[5, 2, 9],
[6, 0, 1],
[0, 4, 5],
[3, 8, 0]])
- Use zip to iterate over multiple iterables.
test2 = test**2
test2
array([[25, 4, 81],
[36, 0, 1],
[ 0, 16, 25],
[ 9, 64, 0]])
for i, j in zip(test, test2):
    print(i, '+', j, '=', i + j)
[5 2 9] + [25 4 81] = [30 6 90]
[6 0 1] + [36 0 1] = [42 0 2]
[0 4 5] + [ 0 16 25] = [ 0 20 30]
[3 8 0] + [ 9 64 0] = [12 72 0]
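Since `np.random.randint` gives different values on every run, the sketch below hard-codes the sample values printed above. It is meant to show that `zip` pairs up the rows of the two arrays, and that the same sums can be had without a Python loop. The names `row_sums` and `vectorized` are illustrative.

```python
import numpy as np

# Fixed copy of the sample values shown above, so the result is reproducible.
test = np.array([[5, 2, 9],
                 [6, 0, 1],
                 [0, 4, 5],
                 [3, 8, 0]])
test2 = test ** 2

# Iterating over a 2-D array goes row by row, so zip pairs matching rows.
row_sums = [i + j for i, j in zip(test, test2)]

# The same result computed in one vectorized step.
vectorized = test + test2
print(np.array_equal(np.array(row_sums), vectorized))
```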
idxs = np.flatnonzero(y_train == y) # indices of the samples whose label equals y
idxs = np.random.choice(idxs, samples_per_class, replace=False) # pick distinct samples from them
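The two lines above depend on `y_train`, `y` and `samples_per_class`, none of which are defined in the notes. The sketch below fills them with hypothetical toy values to show what each step returns; the seed is there only to make the run repeatable.

```python
import numpy as np

np.random.seed(0)                      # only to make the sketch repeatable

# Hypothetical toy labels; the notes do not show the real y_train.
y_train = np.array([0, 1, 2, 1, 0, 1, 2, 1])
y = 1                                  # class to sample from
samples_per_class = 2

candidates = np.flatnonzero(y_train == y)   # positions where the label is y
idxs = np.random.choice(candidates, samples_per_class, replace=False)
print(candidates, idxs)
```

With `replace=False` the same index cannot be drawn twice, so `samples_per_class` must not exceed the number of candidates.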
>>> arr.argsort()[:n] # n-smallest
>>> arr.argsort()[-n:][::-1] # n-largest
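One thing worth spelling out is that both expressions return indices, not values. A small sketch with made-up data (`arr` and `n` are not defined in the notes):

```python
import numpy as np

arr = np.array([7, 1, 9, 3, 5])   # illustrative data
n = 2

smallest_idx = arr.argsort()[:n]          # indices of the n smallest values
largest_idx = arr.argsort()[-n:][::-1]    # indices of the n largest, biggest first

# Index back into arr to get the values themselves.
print(arr[smallest_idx], arr[largest_idx])
```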
NumPy lets you put the slicing for several dimensions inside a single pair of square brackets...
- A single colon : selects everything along that axis.
- Slicing returns a view into the original array; it does not create a copy.
- Use r.copy to create a copy that will not affect the original array.
r_copy = r.copy()
- Indexing with a boolean array is also called masking, boolean indexing, or logical indexing.
- In the first (orange) case both indices are fancy indices, while the other two cases mix regular slicing with fancy indexing.
Fancy Indexing in 2-D
Unlike slicing, fancy indexing creates a copy instead of a view into the original array.
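The view-versus-copy behaviour of the three indexing styles can be checked directly. The sketch below assumes a 6x6 array for `r` (the notes do not define it); `block`, `mask`, `big` and `picked` are illustrative names.

```python
import numpy as np

r = np.arange(36).reshape(6, 6)   # illustrative array; the notes do not define r
r_copy = r.copy()                 # independent copy, unaffected by later writes

block = r[:2, 3:]                 # row slice and column slice in one pair of brackets
block[:] = 0                      # a slice is a view, so this writes into r

mask = r > 30                     # boolean array with the same shape as r
big = r[mask]                     # masking returns the selected values as a 1-D array

picked = r[[0, 2], [1, 3]]        # fancy indexing: elements (0, 1) and (2, 3)
picked[:] = -1                    # picked is a copy, so r is unchanged

print(r[0, 3], r_copy[0, 3], big, r[0, 1])
print(np.shares_memory(r, block), np.shares_memory(r, picked))
```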
- arange
- linspace
- array
- zeros
- ones
- eye
>>> np.eye(3)
array([[ 1., 0., 0.],
       [ 0., 1., 0.],
       [ 0., 0., 1.]])
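For the creation functions not demonstrated earlier in these notes, a short sketch may help; the shapes and values are arbitrary choices made for illustration.

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])   # from a nested Python list
z = np.zeros((2, 3))             # all 0.0, float64 by default
o = np.ones((2, 3), int)         # all 1, integer dtype requested explicitly
e = np.eye(3)                    # 3x3 identity matrix

print(a.shape, z.dtype, o.sum(), e.trace())
```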
- Rule 1: Operations between multiple array objects are first checked for a proper shape match.
  - Broadcasting rule: shapes must be the same, or compatible.
- Rule 2: Mathematical operators and functions (+ - * / exp log) apply element by element, on the values.
- Rule 3: Reduction operations (mean, std, skew, kurt, sum, prod, ...) apply to the whole array, unless an axis is specified.
- Rule 4: Missing values propagate unless explicitly ignored (nanmean, nansum, ...).
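The four rules can be seen in a few lines. This is only a sketch with made-up values; the array names are illustrative.

```python
import numpy as np

a = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])        # shape (2, 3)
row = np.array([10.0, 20.0, 30.0])     # shape (3,)

# Rule 1: (2, 3) and (3,) are compatible, so row is broadcast over both rows.
b = a + row
# Shapes (2, 3) and (2,) are not compatible and raise a ValueError.
try:
    a + np.array([1.0, 2.0])
    mismatch_raised = False
except ValueError:
    mismatch_raised = True

# Rule 2: operators and functions such as np.exp work element by element.
c = np.exp(a * 0)                      # exp(0) for every element

# Rule 3: reductions use the whole array unless an axis is given.
total, per_column = a.sum(), a.sum(axis=0)

# Rule 4: NaN propagates through mean but is skipped by nanmean.
d = np.array([1.0, np.nan, 3.0])
print(b[1], mismatch_raised, total, per_column, np.mean(d), np.nanmean(d))
```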
- Broadcasting rule
- Reduction operations
  - Mathematical functions
    - sum, prod
    - min, max, argmin, argmax
    - ptp (max - min, peak to peak)
  - Statistics
    - mean, std, var
  - Truth value testing
    - any, all
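A small sketch of the reductions listed above, on made-up data. It uses the function form `np.ptp`, on the assumption that the code should also run on NumPy 2.x, where the `ndarray.ptp` method is no longer available.

```python
import numpy as np

a = np.array([[3, 7, 1],
              [8, 2, 5]])    # illustrative data

print(a.sum(), a.prod())               # 26 1680
print(a.min(), a.max())                # 1 8
print(a.argmin(), a.argmax())          # positions in the flattened array
print(np.ptp(a))                       # max - min
print(a.mean(), a.std(), a.var())      # statistics over all elements
print((a > 7).any(), (a > 0).all())    # truth value testing

row_max = a.max(axis=1)                # with an axis: one result per row
print(row_max)
```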
- UNRAVELING
- Reductions such as argmax and argmin treat a multi-dimensional array as if it were flattened to 1D and return a single flat index, which is inconvenient because we usually care about the position along each dimension.
- A very helpful function here is unravel_index, which converts a flat index back into coordinates for a given shape.
>>> a = np.arange(12).reshape(3,4)
>>> np.unravel_index( a.argmax(), a.shape )
(2, 3)
- It has one downside: if there are multiple maxima or minima, it only gives the coordinates of the first one encountered. That is where where comes in.
>>> np.where( a.max(0) )
(array([0, 1, 2, 3]),)
>>> np.where( a.max(0) >= 10 )
(array([2, 3]),)
- Coordinates are returned as a tuple of arrays, one for each axis
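To make the multiple-maxima case concrete, the sketch below adds a second array `b` with a repeated maximum. `b` is not from the notes, and comparing against `b.max()` inside `np.where` is one common way to get every location, offered here as an assumption about what the notes intend.

```python
import numpy as np

a = np.arange(12).reshape(3, 4)
flat_idx = a.argmax()                              # index into the flattened array
row, col = np.unravel_index(flat_idx, a.shape)     # back to 2-D coordinates

# Illustrative array with a repeated maximum (not from the notes).
b = np.array([[1, 9, 3],
              [9, 2, 9]])
first = np.unravel_index(b.argmax(), b.shape)      # only the first maximum
rows, cols = np.where(b == b.max())                # every maximum, one array per axis

print((int(row), int(col)), list(zip(rows.tolist(), cols.tolist())))
```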
# in memory
0 1 2 3 4 5
# python 2d array
0 1 2
3 4 5
- Data in memory is 1D, while the array in Python is multi-dimensional, so we need some extra information to map between the memory buffer and the multi-dimensional array in our program.
- dtype: int64 (for example)
- shape: (2,3)
- ndim: 2
- data: pointer to the underlying memory
- strides: (24,8)
  - the number of bytes that numpy has to jump over to go from one element to the next along each dimension.
  - to go from 0 to 3 in the memory buffer it has to jump over 24 bytes; to go from 0 to 1 it jumps 8 bytes.
- a.T
  - shape: (3,2)
  - strides: (8,24)
- a[:, ::2] (first and last column)
  - shape: (2,2)
  - strides: (24,16)
- When you use fancy indexing, numpy cannot describe the result with strides alone, so it creates a new array.
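The stride values listed above can be reproduced directly. The sketch fixes the dtype to int64 so that each element is 8 bytes regardless of platform; the variable names are illustrative.

```python
import numpy as np

a = np.arange(6, dtype=np.int64).reshape(2, 3)   # int64: 8 bytes per element

s_a = a.strides            # next row is 3 * 8 bytes away, next column 8 bytes
s_t = a.T.strides          # same buffer, axes swapped
s_cols = a[:, ::2].strides # every other column: 2 * 8 bytes per column step

# Transposes and slices are views on the same buffer; fancy indexing copies.
view_shared = np.shares_memory(a, a.T)
fancy_shared = np.shares_memory(a, a[:, [0, 2]])
print(s_a, s_t, s_cols, view_shared, fancy_shared)
```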
- flatten
  - returns a copy of the original data
- ravel
  - returns a reference (or view) if possible (i.e. the memory is contiguous)
  - otherwise copies the original data
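The difference can be checked with `np.shares_memory`. In this sketch the transposed array stands in for the non-contiguous case; the names `f`, `r` and `rt` are illustrative.

```python
import numpy as np

a = np.arange(6).reshape(2, 3)

f = a.flatten()      # always a copy
r = a.ravel()        # a view here, because a is contiguous in memory
rt = a.T.ravel()     # a.T is not C-contiguous, so ravel has to copy

print(np.shares_memory(a, f), np.shares_memory(a, r), np.shares_memory(a, rt))
```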