12 months ago
naveennavinbeast
I am using Boruta for feature selection as part of an ML analysis of my data, but I get an error when I pass my array to the Boruta fit method. Can anyone help me resolve the issue?
The code is:
from boruta import BorutaPy
from sklearn.ensemble import RandomForestClassifier
X1 = final_data.drop("sample_title", axis=1)
y1 = final_data.loc[:, ["sample_title"]]
# Convert y1 to numeric values (assuming they are in string format)
ya = y1.astype(float).values.ravel()
Xn = X1.astype('int_')
yb = ya.astype('int_')
rf1 = RandomForestClassifier(n_estimators=100, random_state=1)
boruta_feature = BorutaPy(rf1, n_estimators='auto', random_state=1)
boruta_feature.fit(Xn, ya)
# Check which features are selected (support)
selected_features = X1.columns[boruta_feature.support_].to_list()
print('Accepted features:', selected_features)
The error message is
AttributeError Traceback (most recent call last)
Cell In[57], line 13
11 rf1 = RandomForestClassifier(n_estimators=100, random_state=1)
12 boruta_feature = BorutaPy(rf1, n_estimators='auto', random_state=1)
---> 13 boruta_feature.fit(Xn, ya)
15 # Check which features are selected (support)
16 selected_features = X1.columns[boruta_feature.support_].to_list()
File ~/.pyenv/versions/3.10.11/lib/python3.10/site-packages/boruta/boruta_py.py:201, in BorutaPy.fit(self, X, y)
188 def fit(self, X, y):
189 """
190 Fits the Boruta feature selection with the provided estimator.
191
(...)
198 The target values.
199 """
--> 201 return self._fit(X, y)
File ~/.pyenv/versions/3.10.11/lib/python3.10/site-packages/boruta/boruta_py.py:260, in BorutaPy._fit(self, X, y)
255 _iter = 1
256 # holds the decision about each feature:
257 # 0 - default state = tentative in original code
258 # 1 - accepted in original code
259 # -1 - rejected in original code
--> 260 dec_reg = np.zeros(n_feat, dtype=np.int)
261 # counts how many times a given feature was more important than
262 # the best of the shadow features
263 hit_reg = np.zeros(n_feat, dtype=np.int)
File ~/.pyenv/versions/3.10.11/lib/python3.10/site-packages/numpy/__init__.py:324, in __getattr__(attr)
319 warnings.warn(
320 f"In the future `np.{attr}` will be defined as the "
321 "corresponding NumPy scalar.", FutureWarning, stacklevel=2)
323 if attr in __former_attrs__:
--> 324 raise AttributeError(__former_attrs__[attr])
326 if attr == 'testing':
327 import numpy.testing as testing
AttributeError: module 'numpy' has no attribute 'int'.
`np.int` was a deprecated alias for the builtin `int`. To avoid this error in existing code, use `int` by itself. Doing this will not modify any behavior and is safe. When replacing `np.int`, you may wish to use e.g. `np.int64` or `np.int32` to specify the precision. If you wish to review your current use, check the release note link for additional information.
The aliases was originally deprecated in NumPy 1.20; for more details and guidance see the original release note at:
https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
I have copy-pasted the whole error message for better understanding. I am a beginner; I tried several possibilities over three days before posting the error.
Please use the formatting bar (especially the code option) to present your post better. You can use backticks for inline code (`text` becomes text), or select a chunk of text and use the highlighted button to format it as a code block. If your code has long lines with a single command, break those lines into multiple lines with proper escape sequences so they're easier to read and still run when copy-pasted. I've done it for you this time.
Thank you, dear Ram. Apologies for the way I posted; this is my first time using Biostars. From now on I will follow your instructions. Thanks for your generous help in structuring my query and for your suggestions on how to post. I hope your edits make it easier for experts to give me valuable input.
I'm going to assume there was a misunderstanding because of the way I phrased my initial comment and that's why all of your text is in code formatting.
The instructions were given to help you format the code parts of your post as code. Formatting all of it as code takes away the meaning from formatting the computer generated/input-output parts of the post as code. Look at how I formatted your post and see if you can notice the difference between the parts I used code formatting for and the parts I did not use it for.
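On the error itself: the traceback shows that the installed boruta_py.py still uses `np.int`, an alias that recent NumPy releases no longer provide, which is exactly what the NumPy message says. Below is a minimal sketch of one possible workaround, not a definitive fix. It assumes you cannot easily change the installed packages, and it also restores `np.float` and `np.bool` on the assumption (not confirmed by the traceback) that the same Boruta file uses them too. The variable `n_feat = 5` is only an illustrative value.

```python
import numpy as np

# Restore the removed aliases only when they are missing, so that
# NumPy versions that still define them are left untouched.
# Assumption: the installed Boruta code refers to np.int (seen in the
# traceback) and possibly np.float / np.bool (not seen in the traceback).
for alias, builtin in (("int", int), ("float", float), ("bool", bool)):
    if not hasattr(np, alias):
        setattr(np, alias, builtin)

# The statement that failed in the traceback now runs.
n_feat = 5  # illustrative value only
dec_reg = np.zeros(n_feat, dtype=np.int)
print(dec_reg)

# Run this before calling boruta_feature.fit(...); the rest of the
# original code stays the same.
```

This only patches the running session. Installing a Boruta release that no longer uses the removed aliases, or a NumPy version that still has them, avoids the patch altogether, but check which versions are compatible with the rest of your environment first.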