Moved discussion in #462 into this separate issue.
Other PythonCall issues related to NumPy dtype and NumPy arrays that may be relevant: #439
My system details:
Julia
julia> versioninfo()
Julia Version 1.12.1
Commit ba1e628ee49 (2025-10-17 13:02 UTC)
Build Info:
Official https://julialang.org release
Platform Info:
OS: Linux (x86_64-linux-gnu)
CPU: 48 × AMD EPYC 7V13 64-Core Processor
WORD_SIZE: 64
LLVM: libLLVM-18.1.7 (ORCJIT, znver3)
GC: Built with stock GC
Threads: 1 default, 1 interactive, 1 GC (on 48 virtual cores)
julia> Pkg.status()
Status `~/temp/Project.toml`
[992eb4ea] CondaPkg v0.2.33
[6099a3de] PythonCall v0.9.28
Python Environment
- python 3.13.9
- numpy 2.3.4
I noticed that PythonCall's to_numpy() and NumPy itself report different dtype values for nested arrays:
julia> using PythonCall
julia> Py(rand(10)).to_numpy().dtype
Python: dtype('float64')
julia> Py([[1,2,3], [4,5,6]]).to_numpy().dtype
Python: dtype('O')
I would expect the latter to be dtype('int64') to match Python:
>>> import numpy
>>> a = numpy.array([[1,2,3], [4,5,6]])
>>> a.dtype
dtype('int64')
Julia does provide Matrix{Int64}, and for a matrix to_numpy() outputs dtype('int64') as I would expect:
julia> [1 2 3; 4 5 6] |> typeof
Matrix{Int64} (alias for Array{Int64, 2})
julia> Py([1 2 3; 4 5 6]).to_numpy().dtype
Python: dtype('int64')
I am working on a project where I need to use PythonCall with Vector{Vector{Int64}} instances, and converting them into Matrix{Int64} is not an option.
In Julia, with PythonCall's to_numpy(), dtype('int64') is output only for Vector{Int64}, not for Vector{Vector{Int64}} or Vector{Vector{Vector{Int64}}}, regardless of how many levels of nesting there are:
julia> Py([1, 2, 3]).to_numpy().dtype
Python: dtype('int64')
julia> Py([[1,2,3], [4,5,6]]).to_numpy().dtype
Python: dtype('O')
julia> Py([[[1,2,3], [4,5,6]], [[7,8,9], [10,11,12]]]).to_numpy().dtype
Python: dtype('O')
Whereas in Python, dtype('int64') is output for all the above:
>>> numpy.array([1, 2, 3]).dtype
dtype('int64')
>>> numpy.array([[1,2,3], [4,5,6]]).dtype
dtype('int64')
>>> numpy.array([[[1,2,3], [4,5,6]], [[7,8,9], [10,11,12]]]).dtype
dtype('int64')
Thus NumPy determines the value of .dtype from the innermost elements of the array, whereas the .dtype of a to_numpy() result in PythonCall is not based on the innermost elements (i.e. 1, 2, 3, etc.) but on the element type of the outermost vector (i.e. Vector{Int64} for the 2nd array and Vector{Vector{Int64}} for the 3rd array).
I expected to_numpy() in PythonCall to behave the same way as NumPy in Python, but the conversion does not agree with NumPy on the dtype values.
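The two views described above can be illustrated with a small pure-Python sketch. This is not how NumPy or PythonCall are implemented; `innermost_view` and `outermost_view` are hypothetical helper names used only for illustration, and the sketch assumes non-empty, rectangular nested lists.

```python
def innermost_view(obj):
    """Descend through nested lists; report the leaf type and the full shape."""
    shape = []
    while isinstance(obj, list):
        shape.append(len(obj))
        obj = obj[0]  # assumption: non-empty, rectangular nesting
    return type(obj).__name__, tuple(shape)


def outermost_view(obj):
    """Look only at the direct elements of the outer list."""
    return type(obj[0]).__name__, (len(obj),)


nested = [[1, 2, 3], [4, 5, 6]]
print(innermost_view(nested))  # ('int', (2, 3))
print(outermost_view(nested))  # ('list', (2,))
```

The first result mirrors what I see from numpy.array (an integer dtype taken from the leaves), while the second mirrors the object dtype reported by to_numpy(), where each element is itself a whole vector.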