Given an adjacency list: adj_list = [array([0,1]), array([0,1,2]), array([0,2])] and an array of indices, ind_arr = array([0,1,2]). Goal:

A = np.zeros((3,3))
for i in ind_arr:
    A[i, list(adj_list[i])] = 1.0/float(adj_list[i].shape[0])

Currently, I have wr
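A sketch of one way to build this row-normalised matrix, using the arrays from the question (note that an integer array can index the columns directly, so the list() and float() conversions are unnecessary):

```python
import numpy as np

# Sample data from the question: each entry lists a node's neighbours.
adj_list = [np.array([0, 1]), np.array([0, 1, 2]), np.array([0, 2])]
ind_arr = np.array([0, 1, 2])

A = np.zeros((3, 3))
for i in ind_arr:
    # Row i gets 1/degree in each neighbour's column.
    A[i, adj_list[i]] = 1.0 / adj_list[i].shape[0]
```

Each row then sums to 1, which is the usual transition-matrix normalisation.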

I have a data file where the first 4 comma-separated values are floats, and the last value is a string that represents a label for that row:

.5, .3, .2, .1, FAA
.2, .3, .5, .2, FXX
.5, .3, .2, .9, FXX
.3, .3, .9, .3, FCA

I want to load the file into a numpy array
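A sketch using np.genfromtxt with a structured dtype (the field names f0–f3 and label are made up here; an io.StringIO stands in for the real file):

```python
import io
import numpy as np

# Stand-in for the file contents.
text = u""".5, .3, .2, .1, FAA
.2, .3, .5, .2, FXX
.5, .3, .2, .9, FXX
.3, .3, .9, .3, FCA"""

# A structured array: four float columns plus a string label column.
# autostrip removes the whitespace after each comma.
data = np.genfromtxt(io.StringIO(text), delimiter=',', autostrip=True,
                     dtype=[('f0', 'f8'), ('f1', 'f8'), ('f2', 'f8'),
                            ('f3', 'f8'), ('label', 'U3')])
```

The numeric fields are then reachable as data['f0'] etc., and the labels as data['label'].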

I have files with a certain format as follows:

36.1 37.1 A: Hi, how are you?
39.1 40.1 B: I am ok!

I am using numpy.loadtxt() to read this file with dtype = np.dtype([('start', '|S1'), ('end', 'f8'), ('person', '|S1'), ('content', '|S100')]). The first 3
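One likely issue is that loadtxt splits the free-text content on every space. A sketch that splits each line at most three times instead (the line list stands in for the real file, and 'start' is made a float field rather than a 1-byte string):

```python
import numpy as np

# Stand-in for the file contents.
lines = ["36.1 37.1 A: Hi, how are you?",
         "39.1 40.1 B: I am ok!"]

# Split at most 3 times so the utterance stays in one piece.
rows = [ln.split(None, 3) for ln in lines]
data = np.array([(float(s), float(e), p, c) for s, e, p, c in rows],
                dtype=[('start', 'f8'), ('end', 'f8'),
                       ('person', 'U2'), ('content', 'U100')])
```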

From this question and others it seems that it is not recommended to use concat or append to build a pandas dataframe, because each call recopies the whole dataframe. My project involves retrieving a small amount of data every 30 seconds. This
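The usual alternative is to accumulate the payloads in a plain Python list and build the frame once. A minimal sketch (the loop and column names stand in for the real 30-second polling):

```python
import pandas as pd

rows = []
for t in range(3):                      # stands in for the polling loop
    rows.append({'time': t, 'value': t * 0.5})

# One allocation at the end instead of a copy per iteration.
df = pd.DataFrame(rows)
```

Appending to a list is O(1) amortised, so the cost no longer grows with the frame's size on every poll.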

I'm trying to generate a 3D distribution, where x, y represents the surface plane, and z is the magnitude of some value, distributed over a range. I'm looking at numpy's multivariate_normal, but it only lets me get a number of samples. I'd like the a
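Rather than drawing samples, the density itself can be evaluated on a grid. A sketch for the standard bivariate normal, written out directly in numpy (grid bounds and resolution are arbitrary choices here):

```python
import numpy as np

# Grid over the surface plane; z holds the density magnitude at each (x, y).
x, y = np.mgrid[-3:3:61j, -3:3:61j]

# Standard bivariate normal density (identity covariance, zero mean).
z = np.exp(-(x**2 + y**2) / 2) / (2 * np.pi)
```

For a general mean and covariance, scipy.stats.multivariate_normal(mean, cov).pdf(np.dstack((x, y))) gives the same kind of grid of values.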

I'm working with python and have a dict where the keys are tuples with 3 values each. I'm computing another tuple with 3 values, and I want to find the tuple in the keys of the dict with the closest values to this newly computed tuple. How should I g
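If "closest" means smallest Euclidean distance, min with a distance key does this in one pass over the keys. A sketch (the dict contents and target are stand-ins; math.dist needs Python 3.8+):

```python
import math

d = {(1.0, 2.0, 3.0): 'a', (4.0, 5.0, 6.0): 'b', (1.1, 2.1, 2.9): 'c'}
target = (1.05, 2.0, 3.0)

# Pick the key minimising Euclidean distance to the target tuple.
nearest = min(d, key=lambda k: math.dist(k, target))
```

For large dicts, a scipy.spatial.cKDTree built over the keys would make repeated lookups faster than a linear scan.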

I have a dataframe that looks like this:

+---+---+---+
|  A|  B|  C|
+---+---+---+
|  1|  3|  1|
|  2|  1|  1|
|  2|  3|  1|
|  1|  2|  1|
|  3|  1|  1|
|  1|  2|  1|
|  2|  1|  1|
|  1|  3|  1|
|  1|  2|  1|
+---+---+---+

I want to reduce the data to only the most frequent combination
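A sketch using DataFrame.value_counts (pandas ≥ 1.1), which counts each distinct row combination and sorts descending, reproducing the table above:

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 2, 1, 3, 1, 2, 1, 1],
                   'B': [3, 1, 3, 2, 1, 2, 1, 3, 2],
                   'C': [1] * 9})

# Count each (A, B, C) combination; the most frequent comes first.
counts = df.value_counts(['A', 'B', 'C'])
top = counts.index[0]
```

Filtering the frame down to that combination is then df[(df[['A', 'B', 'C']] == top).all(axis=1)].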

I have several dataframes which have the same look but different data. DataFrame 1 bid close time 2016-05-24 00:00:00 NaN 2016-05-24 00:05:00 0.000611 2016-05-24 00:10:00 -0.000244 2016-05-24 00:15:00 -0.000122 DataFrame 2 bid close time 2016-05-24 0

I feel like there is some documentation I am missing, but I can't find anything on this specific example - everything is just about concatenating or stacking arrays. I have array x and array y both of shape (2,3) x = [[1,2,3],[4,5,6]] y = [[7,8,9],[1

I have two large data files, one with two columns and one with three columns. I want to select all the rows from the second file that are contained in the first array. My idea was to compare the numpy arrays. Let's say I have: a = np.array([[1, 2, 3],
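The question is cut off, but if the goal is to keep rows of the three-column array whose leading columns match some row of the two-column array, broadcasting gives a loop-free membership test. A sketch with made-up stand-ins for the two files:

```python
import numpy as np

# Hypothetical stand-ins: a has two columns, b three.
a = np.array([[1, 2], [3, 4]])
b = np.array([[1, 2, 10], [5, 6, 11], [3, 4, 12]])

# True where the first two columns of a row of b equal some row of a.
mask = (b[:, None, :2] == a[None, :, :]).all(axis=2).any(axis=1)
selected = b[mask]
```

The broadcast comparison is O(len(a) * len(b)); for very large files a set of row-tuples or a structured-array np.isin would scale better.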

I have data in the following format in a .txt file:

UserId WordID
1 20
1 30
1 40
2 25
2 16
3 56
3 44
3 12

What I'm looking for: some function that can give the result of grouping the WordIDs for every UserId into a list: [[20, 30, 40], [25, 16], [56, 44, 12]] Wh
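A sketch with pandas groupby, which preserves the file order within each group (the DataFrame stands in for the parsed .txt file):

```python
import pandas as pd

df = pd.DataFrame({'UserId': [1, 1, 1, 2, 2, 3, 3, 3],
                   'WordID': [20, 30, 40, 25, 16, 56, 44, 12]})

# One list of WordIDs per user, users in sorted order.
lists = df.groupby('UserId', sort=True)['WordID'].apply(list).tolist()
```

itertools.groupby would do the same without pandas, provided the rows are already sorted by UserId.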

I am working on price-weighted indexes for a class, and although it is a very simple calculation by hand, I figured it would be good practice for my novice Python skills. Edit: this is the code that I am working with now: StockBPrice = np.array([35.1,
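A minimal sketch of the calculation itself (the prices and the divisor are made up; a real price-weighted index such as the Dow uses an adjusted divisor rather than the plain count):

```python
import numpy as np

# Hypothetical constituent prices.
prices = np.array([35.1, 12.5, 47.3])

# A price-weighted index is the sum of constituent prices over a divisor.
divisor = len(prices)
index_level = prices.sum() / divisor
```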

I have an array of numbers whose shape is 26*43264. I would like to reshape this into an array of shape 208*208 but in chunks of 26*26. [[ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9], [10,11,12,13,14,15,16,17,18,19]] becomes something like: [[0, 1, 2, 3, 4], [10,1
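The small example can be done with a reshape, an axis swap, and a second reshape; a sketch for the 2x10 case shown (the full 26x43264 case would follow the same pattern with the chunk shape substituted, assuming the sizes divide evenly):

```python
import numpy as np

a = np.arange(20).reshape(2, 10)

# Split each row into width-5 chunks, then stack the chunks vertically,
# turning the 2x10 array into a 4x5 array built from 2x5 blocks.
out = a.reshape(2, 2, 5).transpose(1, 0, 2).reshape(4, 5)
```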

Assume that I have a Python project structure as: main.py which imports random_initialization.py main.py which imports sample_around_solution.py Both random_initialization and sample_around_solution.py import numpy. Now, random_initialization starts

When I try to get just the first element of an array like this:

import numpy
a = numpy.array([1,2])
a[:,0]

I get this error:

---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipyth
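The array here has only one axis, so a[:, 0] asks for a second axis that does not exist. A sketch of the two usual fixes:

```python
import numpy as np

a = np.array([1, 2])

# a is one-dimensional: index its single axis directly,
first = a[0]

# or add a second axis first if 2-D indexing is really wanted.
row_first = a.reshape(1, -1)[:, 0]
```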

I saw very strange behavior in a numpy array: when I mixed int32 and int8 arrays in a simple operation, the int32 array element ct[4,0] seems to have become 8-bit when taking the result of += dleng[4]*4: import numpy as np In[3]: ct = np.zeros((6,1),
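The snippet is cut off, but the likely mechanism is that dleng[4]*4 is computed in int8 first and wraps around before the += ever touches ct; ct itself stays int32. A sketch of how explicit promotion avoids the wrap (the array contents here are made up):

```python
import numpy as np

ct = np.zeros((6, 1), dtype=np.int32)
dleng = np.full(6, 100, dtype=np.int8)

# Promote the int8 operand before multiplying, so the product is
# computed in int32 (100 * 4 = 400, which would wrap in int8).
ct[4, 0] += np.int32(dleng[4]) * 4
```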

When install pandas, it requires numpy to be installed and on installing it gives following error: Processing numpy-1.9.1.zip Writing c:\cygwin64\tmp\easy_install-4x5clr\numpy-1.9.1\setup.cfg Running numpy-1.9.1\setup.py -q bdist_egg --dist-dir c:\cy

I guess I'm having a slow day and can't figure this one out. I have an m x n numpy array and want to convert it to a vector where each element is a 3 dimensional vector containing the row number, column number and value of all the elements in the arr
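np.indices produces the row and column numbers for every position, so the (row, col, value) triples fall out of one column_stack. A sketch with a small made-up array:

```python
import numpy as np

a = np.array([[7, 8], [9, 10]])

# Row indices, column indices, and values, one (m*n, 3) row per element.
rows, cols = np.indices(a.shape)
triples = np.column_stack((rows.ravel(), cols.ravel(), a.ravel()))
```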

I have a number of lists (time series): dictionary = {'a': [1,2,3,4,5], 'b': [5,2,3,4,1], 'c': [1,3,5,4,6]} that I would like to average into a single merged series: merged = {'m': [2.33,2.33,3.66,4.0,4.0]} Is there a smart way to do this? What if the lists have dif
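For equal-length lists, stacking them and averaging down the columns gives the merged series in one call. A sketch with the dict from the question:

```python
import numpy as np

dictionary = {'a': [1, 2, 3, 4, 5], 'b': [5, 2, 3, 4, 1], 'c': [1, 3, 5, 4, 6]}

# Stack the series as rows and take the column-wise mean.
merged = {'m': np.mean(list(dictionary.values()), axis=0).tolist()}
```

For lists of different lengths, one option is to pad the shorter ones with np.nan to a common length and use np.nanmean instead, so missing points are ignored rather than averaged in.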

This question already has an answer here: Find nearest value in numpy array. The given value is 6.6, but 6.6 is not in the array (data below); the nearest value to it is 6.7. How can I get that position? import nu
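The standard pattern is argmin over the absolute differences. A sketch (the array contents are stand-ins, since the question's data is cut off):

```python
import numpy as np

# Hypothetical data; 6.6 itself is absent but 6.7 is present.
data = np.array([6.1, 6.3, 6.7, 7.0])
value = 6.6

# Position of the element with the smallest absolute difference.
idx = int(np.abs(data - value).argmin())
```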

I am trying to set the values in a numpy array to zero if they are equal to any number in a list. Let's consider the following array: a = numpy.array([[1, 2, 3], [4, 8, 6], [7, 8, 9]]) I want to set multiple elements of a which are in the list [1, 2,
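np.isin builds a boolean mask of every position whose value appears in the list, and that mask can be assigned through directly. A sketch using the array from the question (the list [1, 2, 8] is an assumption, since the original list is cut off):

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 8, 6], [7, 8, 9]])

# Zero every element whose value appears in the list.
a[np.isin(a, [1, 2, 8])] = 0
```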

I have two vectors; one for hours in the day [1,2,3,...,24], and the second for days in the year [1,2,3,4,5,6,...,365] I would like to construct a matrix of 24*365 cells, 24 rows and 365 columns. Something like: a = [(1,24),(2,24),(3,24),(4,24),(5,24
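np.meshgrid pairs every hour with every day in one call, producing two 24x365 coordinate grids. A sketch:

```python
import numpy as np

hours = np.arange(1, 25)     # 1..24
days = np.arange(1, 366)     # 1..365

# Two 24x365 grids: hour_grid varies down the rows, day_grid across columns.
day_grid, hour_grid = np.meshgrid(days, hours)
```

If actual (hour, day) pairs are wanted in one array, np.stack((hour_grid, day_grid), axis=-1) gives a 24x365x2 array of them.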

This is similar to a question asked on the Software Engineering Stack Exchange: https://softwareengineering.stackexchange.com/questions/158247/binary-representation-in-python-and-keeping-leading-zeros Essentially, I have some numbers that I keep track of in h
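For keeping leading zeros in a binary representation, format() with a zero-padded width specifier is the usual tool. A sketch (the value and the 8-bit width are made-up examples):

```python
# A hexadecimal value rendered as fixed-width binary.
value = 0x1A

# '08b': binary, zero-padded to 8 digits, so leading zeros survive.
bits = format(value, '08b')
```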

I'd like to make a set of comparable empirical CDFs for a few numpy arrays (each of a different length) and store these in a pandas dataframe:

a = scipy.randn(100)
b = scipy.randn(500)

# ECDF from statsmodels
cdf_a = ECDF(a)
cdf_b = ECDF(b)

The problem
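Since the arrays have different lengths, the usual trick is to evaluate every ECDF on one shared grid so the dataframe columns align. A sketch with a hand-rolled ECDF via searchsorted, so it runs without statsmodels (the grid bounds are arbitrary choices):

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
a = rng.standard_normal(100)
b = rng.standard_normal(500)

grid = np.linspace(-4, 4, 61)

def ecdf(arr):
    # Fraction of samples at or below each grid point.
    return np.searchsorted(np.sort(arr), grid, side='right') / len(arr)

# One row per grid point, one aligned column per array.
df = pd.DataFrame({'x': grid, 'cdf_a': ecdf(a), 'cdf_b': ecdf(b)})
```

statsmodels' ECDF objects are callable, so ECDF(a)(grid) would fill the same columns.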

I am trying to freeze a Python script with cx_Freeze. The script makes use of pandas. When I run the executable created by cx_Freeze, I get the following Traceback: [...] File "C:\Python27\lib\site-packages\pandas\__init__.py", line 6, in <mo

Let's say I have some 32-bit numbers and some 64-bit numbers:

>>> import numpy as np
>>> w = np.float32(2.4)
>>> x = np.float32(4.555555555555555)
>>> y = np.float64(2.4)
>>> z = np.float64(4.555555555555555)

I
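The question is cut off, but a common point of confusion with these values is that float32 carries only about 7 significant decimal digits while float64 carries 15-16, so the same literal rounds to two different numbers. A sketch showing the gap:

```python
import numpy as np

x = np.float32(4.555555555555555)
z = np.float64(4.555555555555555)

# The float32 rounds the literal more coarsely than the float64 does,
# so converting both back to Python floats reveals a small gap.
diff = abs(float(x) - float(z))
```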

My impression is that in NumPy, two arrays can share the same memory. Take the following example:

import numpy as np
a = np.arange(27)
b = a.reshape((3,3,3))
a[0] = 5000
print(b[0,0,0])  # 5000

# Some tests:
a.data is b.data  # False
a.data == b.data  # True
c=n
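Comparing the .data memoryview objects is not a reliable sharing test, because a fresh memoryview is created on each attribute access. np.shares_memory is the direct way to ask the question the snippet is probing:

```python
import numpy as np

a = np.arange(27)
b = a.reshape((3, 3, 3))   # reshape returns a view here

a[0] = 5000                # the view sees the change

# The reliable test for overlapping memory.
shared = np.shares_memory(a, b)
```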

In a Java application I need to use a specific image processing algorithm that is currently implemented in Python. What would be the best approach, knowing that this script uses the NumPy library? I already tried to compile the script to Java using

I'm using python and numpy/scipy to do regex and stemming for a text processing application. But I want to use some of R's statistical packages as well. What's the best way to pass the data from python to R? (And back?) Also, I need to backup the arr
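The richest bridge is rpy2, which converts numpy arrays to R vectors in-process. A lighter, dependency-free sketch is a flat-file round trip, which also doubles as the backup the question mentions (the file path here is a made-up temp file):

```python
import os
import tempfile
import numpy as np

arr = np.array([[1.5, 2.5], [3.5, 4.5]])

# Write the array as CSV for R's read.csv, and read R's write.csv
# output back the same way with np.genfromtxt.
path = os.path.join(tempfile.gettempdir(), 'for_r.csv')
np.savetxt(path, arr, delimiter=',')
back = np.genfromtxt(path, delimiter=',')
```

savetxt's default %.18e format preserves doubles exactly across the round trip.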

I want to increment a small subsection (variable) of a matrix [illustrative code below], but looping over the elements seems sloppy and inelegant, and I suspect it is the slowest way to do this calculation. One of the ideas I had was to create another arr
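Because numpy slices are views into the original matrix, a single in-place += on a slice replaces the loop entirely. A sketch with made-up block bounds:

```python
import numpy as np

m = np.zeros((5, 5))

# Increment a rectangular sub-block in one vectorised step: the slice is
# a view, so += writes straight into m.
m[1:3, 2:5] += 1.0
```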