
macOS compatibility? #14

Open
vade opened this issue Jul 18, 2020 · 4 comments

Comments


vade commented Jul 18, 2020

Hello

Firstly, thank you for this repo and your work. I'm able to run your examples on iOS without issue.

I am attempting to run your example code in a simple macOS test harness, but I am not getting the expected results. Prediction runs: I can load the model, the config, and the anchors, configure Vision and the request, and provide an image, yet the VNCoreMLRequest results always have a score of 0 inside Detection.detectionsFromFeatureValue, and so far I have been unable to debug why.

Verified:

  • properly configure MaskRCNNConfig.defaultConfig
  • load model
  • set up vision model
  • fetch image, make CIImage
  • set up request
  • set up handler
  • run predict
  • get results with 2 VNCoreMLFeatureValueObservation

However, I don't seem to get valid scores for what appear to be valid input images. I can verify that iOS, given the same image, works and produces a great prediction, bounding box, and mask.

Have you been able to run this code on macOS?
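The steps listed above can be sketched roughly as follows (a minimal sketch, not the harness's actual code; `MaskRCNN` is the repo's generated model class, the image URL and function name are assumptions):

```swift
import Vision
import CoreML

// Minimal sketch of the macOS pipeline described above: load the model,
// wrap it for Vision, build a request, and run it on an image.
func runDetection(on imageURL: URL) throws {
    let coreMLModel = try MaskRCNN(configuration: MLModelConfiguration()).model
    let visionModel = try VNCoreMLModel(for: coreMLModel)

    let request = VNCoreMLRequest(model: visionModel) { request, error in
        // Two VNCoreMLFeatureValueObservation results are expected
        // (detections and masks), as noted above.
        guard let observations = request.results as? [VNCoreMLFeatureValueObservation] else { return }
        for observation in observations {
            print(observation.featureValue)
        }
    }
    request.imageCropAndScaleOption = .scaleFill

    let handler = VNImageRequestHandler(url: imageURL, options: [:])
    try handler.perform([request])
}
```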


vade commented Jul 18, 2020

macOS computed feature for image 'a'

MultiArray : Double 100 x 6 matrix
[0,0,0,0,0,0;
 0,0,0,0,0,0;
 0,0,0,0,0,0;
 ...
 0,0,0,0,0,0]
(all 100 rows are zero; repeated rows elided)

iOS computed feature for the same image 'a'

Double 100 x 6 matrix
[0.2269468903541565,0.246346652507782,0.7716894745826721,0.9700236916542053,1,0.9998897314071655;
 0,0,0,0,0,0;
 0,0,0,0,0,0;
 ...
 0,0,0,0,0,0]
(remaining 99 rows are zero; repeated rows elided)


vade commented Jul 18, 2020

Dropping directly into Core ML rather than Vision produces the same result. I manually resize a CGImageRef to 1024x1024, convert it to a CVPixelBufferRef, and pass it to MaskRCNN.prediction(image:), and I get the same as Vision: an empty MLFeatureValue, same as above.
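The direct Core ML path described above can be sketched like this (an assumed implementation, not the harness's actual code; `MaskRCNN` is the repo's generated model class):

```swift
import CoreML
import CoreGraphics

// Sketch: render a CGImage into a 1024x1024 BGRA CVPixelBuffer, suitable
// for passing to the generated prediction method.
func pixelBuffer(from cgImage: CGImage, width: Int = 1024, height: Int = 1024) -> CVPixelBuffer? {
    var buffer: CVPixelBuffer?
    let attrs: [CFString: Any] = [kCVPixelBufferCGImageCompatibilityKey: true,
                                  kCVPixelBufferCGBitmapContextCompatibilityKey: true]
    CVPixelBufferCreate(kCFAllocatorDefault, width, height,
                        kCVPixelFormatType_32BGRA, attrs as CFDictionary, &buffer)
    guard let pb = buffer else { return nil }

    CVPixelBufferLockBaseAddress(pb, [])
    defer { CVPixelBufferUnlockBaseAddress(pb, []) }
    // Draw the (implicitly resized) image into the pixel buffer's memory.
    let context = CGContext(data: CVPixelBufferGetBaseAddress(pb),
                            width: width, height: height,
                            bitsPerComponent: 8,
                            bytesPerRow: CVPixelBufferGetBytesPerRow(pb),
                            space: CGColorSpaceCreateDeviceRGB(),
                            bitmapInfo: CGImageAlphaInfo.noneSkipFirst.rawValue
                                | CGBitmapInfo.byteOrder32Little.rawValue)
    context?.draw(cgImage, in: CGRect(x: 0, y: 0, width: width, height: height))
    return pb
}

// Usage (sketch):
// if let pb = pixelBuffer(from: cgImage) {
//     let output = try MaskRCNN(configuration: MLModelConfiguration()).prediction(image: pb)
// }
```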


vade commented Jul 18, 2020

Apologies for the monologue; here is an interesting observation on the issue:

It appears that custom layers on Core ML models are loaded slightly differently on iOS than on macOS, at least with a sample size of one for this MaskRCNN.

To see this, I added some debug logging to the custom layer initializers and functions.
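That kind of logging can be sketched as a minimal MLCustomLayer conformance (a hypothetical layer, not the repo's actual implementation; the real layers compute real shapes and outputs):

```swift
import CoreML

// Sketch of an MLCustomLayer instrumented to print its parameters and the
// shapes Core ML asks it to compute, mirroring the logs below.
@objc(DebugLayer)
class DebugLayer: NSObject, MLCustomLayer {
    required init(parameters: [String: Any]) throws {
        print("init(parameters:)", parameters)
        super.init()
    }

    func setWeightData(_ weights: [Data]) throws {}

    func outputShapes(forInputShapes inputShapes: [[NSNumber]]) throws -> [[NSNumber]] {
        print("outputShapes(forInputShapes:)", inputShapes)
        return inputShapes  // placeholder; a real layer derives its output shapes here
    }

    func evaluate(inputs: [MLMultiArray], outputs: [MLMultiArray]) throws {
        print("evaluate(inputs:outputs:)", inputs.count, outputs.count)
    }
}
```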

When iOS loads the Core ML model and runs inference (prediction is called), we see:

2020-07-18 18:00:30.364779-0400 Example[5118:1071328] Metal GPU Frame Capture Enabled
2020-07-18 18:00:30.366195-0400 Example[5118:1071328] Metal API Validation Enabled
init(parameters:) ["nmsIOUThreshold": 0.7, "bboxStdDev_3": 0.2, "bboxStdDev_1": 0.1, "engineName": ProposalLayer, "preNMSMaxProposals": 6000, "maxProposals": 1000, "bboxStdDev_2": 0.2, "bboxStdDev_0": 0.1, "bboxStdDev_count": 4]
init(parameters:) ["imageHeight": 1024, "imageWidth": 1024, "engineName": PyramidROIAlignLayer, "poolSize": 7]
init(parameters:) ["engineName": TimeDistributedClassifierLayer]
init(parameters:) ["maxDetections": 100, "bboxStdDev_1": 0.1, "engineName": DetectionLayer, "bboxStdDev_3": 0.2, "bboxStdDev_0": 0.1, "scoreThreshold": 0.7, "nmsIOUThreshold": 0.3, "bboxStdDev_2": 0.2, "bboxStdDev_count": 4]
init(parameters:) ["imageWidth": 1024, "engineName": PyramidROIAlignLayer, "poolSize": 14, "imageHeight": 1024]
init(parameters:) ["engineName": TimeDistributedMaskLayer]
2020-07-18 18:00:30.935153-0400 Example[5118:1071289] [discovery] errors encountered while discovering extensions: Error Domain=PlugInKit Code=13 "query cancelled" UserInfo={NSLocalizedDescription=query cancelled}
init(parameters:) ["bboxStdDev_2": 0.2, "bboxStdDev_3": 0.2, "bboxStdDev_count": 4, "maxProposals": 1000, "nmsIOUThreshold": 0.7, "preNMSMaxProposals": 6000, "engineName": ProposalLayer, "bboxStdDev_0": 0.1, "bboxStdDev_1": 0.1]
init(parameters:) ["engineName": PyramidROIAlignLayer, "imageHeight": 1024, "imageWidth": 1024, "poolSize": 7]
init(parameters:) ["engineName": TimeDistributedClassifierLayer]
init(parameters:) ["scoreThreshold": 0.7, "bboxStdDev_count": 4, "bboxStdDev_1": 0.1, "engineName": DetectionLayer, "bboxStdDev_2": 0.2, "maxDetections": 100, "bboxStdDev_0": 0.1, "nmsIOUThreshold": 0.3, "bboxStdDev_3": 0.2]
init(parameters:) ["imageHeight": 1024, "engineName": PyramidROIAlignLayer, "imageWidth": 1024, "poolSize": 14]
init(parameters:) ["engineName": TimeDistributedMaskLayer]
outputShapes(forInputShapes:) [[0, 0, 0, 0, 0], [0, 0, 0, 0, 0]] [[1, 0, 0, 0, 0]]
outputShapes(forInputShapes:) [[261888, 1, 2, 1, 1], [261888, 1, 4, 1, 1]] (Function)
outputShapes(forInputShapes:) [[1000, 1, 4, 1, 1], [1, 1, 256, 256, 256], [1, 1, 256, 128, 128], [1, 1, 256, 64, 64], [1, 1, 256, 32, 32]] [[1000, 1, 256, 7, 7]]
outputShapes(forInputShapes:) [[1000, 1, 256, 7, 7]] [[1000, 1, 1, 1, 6]]
outputShapes(forInputShapes:) [[1000, 1, 4, 1, 1], [1000, 1, 1, 1, 6]] [[100, 1, 6, 1, 1]]
outputShapes(forInputShapes:) [[100, 1, 6, 1, 1], [1, 1, 256, 256, 256], [1, 1, 256, 128, 128], [1, 1, 256, 64, 64], [1, 1, 256, 32, 32]] [[100, 1, 256, 14, 14]]
outputShapes(forInputShapes:) [[100, 1, 256, 14, 14], [100, 1, 6, 1, 1]] [[1, 1, 100, 28, 28]]
outputShapes(forInputShapes:) [[261888, 1, 2, 1, 1], [261888, 1, 4, 1, 1]] (Function)
outputShapes(forInputShapes:) [[1000, 1, 4, 1, 1], [1, 1, 256, 256, 256], [1, 1, 256, 128, 128], [1, 1, 256, 64, 64], [1, 1, 256, 32, 32]] [[1000, 1, 256, 7, 7]]
outputShapes(forInputShapes:) [[1000, 1, 256, 7, 7]] [[1000, 1, 1, 1, 6]]
outputShapes(forInputShapes:) [[1000, 1, 4, 1, 1], [1000, 1, 1, 1, 6]] [[100, 1, 6, 1, 1]]
outputShapes(forInputShapes:) [[100, 1, 6, 1, 1], [1, 1, 256, 256, 256], [1, 1, 256, 128, 128], [1, 1, 256, 64, 64], [1, 1, 256, 32, 32]] [[100, 1, 256, 14, 14]]
outputShapes(forInputShapes:) [[100, 1, 256, 14, 14], [100, 1, 6, 1, 1]] [[1, 1, 100, 28, 28]]
outputShapes(forInputShapes:) [[261888, 1, 2, 1, 1], [261888, 1, 4, 1, 1]] (Function)
outputShapes(forInputShapes:) [[1000, 1, 4, 1, 1], [1, 1, 256, 256, 256], [1, 1, 256, 128, 128], [1, 1, 256, 64, 64], [1, 1, 256, 32, 32]] [[1000, 1, 256, 7, 7]]
outputShapes(forInputShapes:) [[1000, 1, 256, 7, 7]] [[1000, 1, 1, 1, 6]]
outputShapes(forInputShapes:) [[1000, 1, 4, 1, 1], [1000, 1, 1, 1, 6]] [[100, 1, 6, 1, 1]]
outputShapes(forInputShapes:) [[100, 1, 6, 1, 1], [1, 1, 256, 256, 256], [1, 1, 256, 128, 128], [1, 1, 256, 64, 64], [1, 1, 256, 32, 32]] [[100, 1, 256, 14, 14]]
outputShapes(forInputShapes:) [[100, 1, 256, 14, 14], [100, 1, 6, 1, 1]] [[1, 1, 100, 28, 28]]
evaluate(inputs:outputs:) 2 1
evaluate(inputs:outputs:) 5 1
evaluate(inputs:outputs:) 1 1
evaluate(inputs:outputs:) 2 1
evaluate(inputs:outputs:) 5 1
evaluate(inputs:outputs:) 2 1
[Example.Detection(index: 0, boundingBox: (0.24634665250778198, 0.2269468903541565, 0.7236770391464233, 0.5447425842285156), classId: 1, score: 0.9998897314071655, mask: Optional(<CGImage 0x168800680> (DP)
	<(null)>
		width = 28, height = 28, bpc = 8, bpp = 8, row bytes = 28 
		kCGImageAlphaNone | 0 (default byte order)  | kCGImagePixelFormatPacked 
		is mask? Yes, has masking color? No, has soft mask? No, has matte? No, should interpolate? No))]

This shows the initializers with the expected keys/values, and the expected output sizes and shapes.

macOS, however, shows a two-pass initialization, the first pass with very different (all-zero) input shapes:

init(parameters:) ["bboxStdDev_3": 0.2, "bboxStdDev_count": 4, "nmsIOUThreshold": 0.7, "bboxStdDev_0": 0.1, "maxProposals": 1000, "bboxStdDev_1": 0.1, "bboxStdDev_2": 0.2, "preNMSMaxProposals": 6000, "engineName": ProposalLayer]
init(parameters:) ["engineName": PyramidROIAlignLayer, "poolSize": 7, "imageHeight": 1024, "imageWidth": 1024]
init(parameters:) ["engineName": TimeDistributedClassifierLayer]
init(parameters:) ["bboxStdDev_0": 0.1, "bboxStdDev_3": 0.2, "bboxStdDev_2": 0.2, "engineName": DetectionLayer, "bboxStdDev_1": 0.1, "bboxStdDev_count": 4, "nmsIOUThreshold": 0.3, "scoreThreshold": 0.7, "maxDetections": 100]
init(parameters:) ["poolSize": 14, "imageHeight": 1024, "engineName": PyramidROIAlignLayer, "imageWidth": 1024]
init(parameters:) ["engineName": TimeDistributedMaskLayer]
init(parameters:) ["nmsIOUThreshold": 0.7, "preNMSMaxProposals": 6000, "maxProposals": 1000, "bboxStdDev_0": 0.1, "bboxStdDev_3": 0.2, "bboxStdDev_2": 0.2, "bboxStdDev_count": 4, "engineName": ProposalLayer, "bboxStdDev_1": 0.1]
init(parameters:) ["poolSize": 7, "imageHeight": 1024, "engineName": PyramidROIAlignLayer, "imageWidth": 1024]
init(parameters:) ["engineName": TimeDistributedClassifierLayer]
init(parameters:) ["bboxStdDev_3": 0.2, "bboxStdDev_count": 4, "maxDetections": 100, "engineName": DetectionLayer, "nmsIOUThreshold": 0.3, "scoreThreshold": 0.7, "bboxStdDev_2": 0.2, "bboxStdDev_1": 0.1, "bboxStdDev_0": 0.1]
init(parameters:) ["imageHeight": 1024, "imageWidth": 1024, "poolSize": 14, "engineName": PyramidROIAlignLayer]
init(parameters:) ["engineName": TimeDistributedMaskLayer]
init(parameters:) ["bboxStdDev_count": 4, "preNMSMaxProposals": 6000, "maxProposals": 1000, "bboxStdDev_2": 0.2, "nmsIOUThreshold": 0.7, "bboxStdDev_3": 0.2, "bboxStdDev_0": 0.1, "engineName": ProposalLayer, "bboxStdDev_1": 0.1]
init(parameters:) ["engineName": PyramidROIAlignLayer, "poolSize": 7, "imageHeight": 1024, "imageWidth": 1024]
init(parameters:) ["engineName": TimeDistributedClassifierLayer]
init(parameters:) ["bboxStdDev_3": 0.2, "bboxStdDev_0": 0.1, "nmsIOUThreshold": 0.3, "bboxStdDev_2": 0.2, "engineName": DetectionLayer, "bboxStdDev_1": 0.1, "maxDetections": 100, "scoreThreshold": 0.7, "bboxStdDev_count": 4]
init(parameters:) ["imageHeight": 1024, "imageWidth": 1024, "poolSize": 14, "engineName": PyramidROIAlignLayer]
init(parameters:) ["engineName": TimeDistributedMaskLayer]
outputShapes(forInputShapes:) [[0, 0, 0, 0, 0], [0, 0, 0, 0, 0]] (Function)
outputShapes(forInputShapes:) [[0, 0, 0, 0, 0], [0, 0, 0, 0, 0]] (Function)
outputShapes(forInputShapes:) [[0, 0, 0, 0, 0], [0, 0, 0, 0, 0], [0, 0, 0, 0, 0], [0, 0, 0, 0, 0], [0, 0, 0, 0, 0]] [[0, 0, 0, 7, 7]]
outputShapes(forInputShapes:) [[0, 0, 0, 0, 0], [0, 0, 0, 0, 0], [0, 0, 0, 0, 0], [0, 0, 0, 0, 0], [0, 0, 0, 0, 0]] [[0, 0, 0, 7, 7]]
outputShapes(forInputShapes:) [[0, 0, 0, 0, 0]] [[0, 0, 1, 1, 6]]
outputShapes(forInputShapes:) [[0, 0, 0, 0, 0]] [[0, 0, 1, 1, 6]]
outputShapes(forInputShapes:) [[0, 0, 0, 0, 0], [0, 0, 0, 0, 0]] [[100, 0, 6, 1, 1]]
outputShapes(forInputShapes:) [[0, 0, 0, 0, 0], [0, 0, 0, 0, 0]] [[100, 0, 6, 1, 1]]
outputShapes(forInputShapes:) [[0, 0, 0, 0, 0], [0, 0, 0, 0, 0], [0, 0, 0, 0, 0], [0, 0, 0, 0, 0], [0, 0, 0, 0, 0]] [[0, 0, 0, 14, 14]]
outputShapes(forInputShapes:) [[0, 0, 0, 0, 0], [0, 0, 0, 0, 0], [0, 0, 0, 0, 0], [0, 0, 0, 0, 0], [0, 0, 0, 0, 0]] [[0, 0, 0, 14, 14]]
outputShapes(forInputShapes:) [[0, 0, 0, 0, 0], [0, 0, 0, 0, 0]] [[1, 0, 0, 0, 0]]
outputShapes(forInputShapes:) [[0, 0, 0, 0, 0], [0, 0, 0, 0, 0]] [[1, 0, 0, 0, 0]]
init(parameters:) ["preNMSMaxProposals": 6000, "bboxStdDev_3": 0.2, "bboxStdDev_0": 0.1, "bboxStdDev_1": 0.1, "bboxStdDev_count": 4, "nmsIOUThreshold": 0.7, "engineName": ProposalLayer, "bboxStdDev_2": 0.2, "maxProposals": 1000]
init(parameters:) ["poolSize": 7, "engineName": PyramidROIAlignLayer, "imageHeight": 1024, "imageWidth": 1024]
init(parameters:) ["engineName": TimeDistributedClassifierLayer]
init(parameters:) ["maxDetections": 100, "bboxStdDev_2": 0.2, "bboxStdDev_0": 0.1, "bboxStdDev_3": 0.2, "bboxStdDev_count": 4, "scoreThreshold": 0.7, "nmsIOUThreshold": 0.3, "engineName": DetectionLayer, "bboxStdDev_1": 0.1]
init(parameters:) ["poolSize": 14, "engineName": PyramidROIAlignLayer, "imageWidth": 1024, "imageHeight": 1024]
init(parameters:) ["engineName": TimeDistributedMaskLayer]
init(parameters:) ["bboxStdDev_1": 0.1, "nmsIOUThreshold": 0.7, "engineName": ProposalLayer, "preNMSMaxProposals": 6000, "maxProposals": 1000, "bboxStdDev_3": 0.2, "bboxStdDev_0": 0.1, "bboxStdDev_count": 4, "bboxStdDev_2": 0.2]
init(parameters:) ["imageHeight": 1024, "imageWidth": 1024, "poolSize": 7, "engineName": PyramidROIAlignLayer]
init(parameters:) ["engineName": TimeDistributedClassifierLayer]
init(parameters:) ["engineName": DetectionLayer, "bboxStdDev_1": 0.1, "bboxStdDev_3": 0.2, "nmsIOUThreshold": 0.3, "bboxStdDev_count": 4, "bboxStdDev_0": 0.1, "bboxStdDev_2": 0.2, "scoreThreshold": 0.7, "maxDetections": 100]
init(parameters:) ["imageHeight": 1024, "engineName": PyramidROIAlignLayer, "poolSize": 14, "imageWidth": 1024]
init(parameters:) ["engineName": TimeDistributedMaskLayer]
outputShapes(forInputShapes:) [[0, 0, 0, 0, 0], [0, 0, 0, 0, 0]] [[1, 0, 0, 0, 0]]
outputShapes(forInputShapes:) [[261888, 1, 2, 1, 1], [261888, 1, 4, 1, 1]] (Function)
outputShapes(forInputShapes:) [[1000, 1, 4, 1, 1], [1, 1, 256, 256, 256], [1, 1, 256, 128, 128], [1, 1, 256, 64, 64], [1, 1, 256, 32, 32]] [[1000, 1, 256, 7, 7]]
outputShapes(forInputShapes:) [[1000, 1, 256, 7, 7]] [[1000, 1, 1, 1, 6]]
outputShapes(forInputShapes:) [[1000, 1, 4, 1, 1], [1000, 1, 1, 1, 6]] [[100, 1, 6, 1, 1]]
outputShapes(forInputShapes:) [[100, 1, 6, 1, 1], [1, 1, 256, 256, 256], [1, 1, 256, 128, 128], [1, 1, 256, 64, 64], [1, 1, 256, 32, 32]] [[100, 1, 256, 14, 14]]
outputShapes(forInputShapes:) [[100, 1, 256, 14, 14], [100, 1, 6, 1, 1]] [[1, 1, 100, 28, 28]]
outputShapes(forInputShapes:) [[261888, 1, 2, 1, 1], [261888, 1, 4, 1, 1]] (Function)
outputShapes(forInputShapes:) [[1000, 1, 4, 1, 1], [1, 1, 256, 256, 256], [1, 1, 256, 128, 128], [1, 1, 256, 64, 64], [1, 1, 256, 32, 32]] [[1000, 1, 256, 7, 7]]
outputShapes(forInputShapes:) [[1000, 1, 256, 7, 7]] [[1000, 1, 1, 1, 6]]
outputShapes(forInputShapes:) [[1000, 1, 4, 1, 1], [1000, 1, 1, 1, 6]] [[100, 1, 6, 1, 1]]
outputShapes(forInputShapes:) [[100, 1, 6, 1, 1], [1, 1, 256, 256, 256], [1, 1, 256, 128, 128], [1, 1, 256, 64, 64], [1, 1, 256, 32, 32]] [[100, 1, 256, 14, 14]]
outputShapes(forInputShapes:) [[100, 1, 256, 14, 14], [100, 1, 6, 1, 1]] [[1, 1, 100, 28, 28]]
outputShapes(forInputShapes:) [[261888, 1, 2, 1, 1], [261888, 1, 4, 1, 1]] (Function)
outputShapes(forInputShapes:) [[1000, 1, 4, 1, 1], [1, 1, 256, 256, 256], [1, 1, 256, 128, 128], [1, 1, 256, 64, 64], [1, 1, 256, 32, 32]] [[1000, 1, 256, 7, 7]]
outputShapes(forInputShapes:) [[1000, 1, 256, 7, 7]] [[1000, 1, 1, 1, 6]]
outputShapes(forInputShapes:) [[1000, 1, 4, 1, 1], [1000, 1, 1, 1, 6]] [[100, 1, 6, 1, 1]]
outputShapes(forInputShapes:) [[100, 1, 6, 1, 1], [1, 1, 256, 256, 256], [1, 1, 256, 128, 128], [1, 1, 256, 64, 64], [1, 1, 256, 32, 32]] [[100, 1, 256, 14, 14]]
outputShapes(forInputShapes:) [[100, 1, 256, 14, 14], [100, 1, 6, 1, 1]] [[1, 1, 100, 28, 28]]
evaluate(inputs:outputs:) 2 1
evaluate(inputs:outputs:) 5 1
evaluate(inputs:outputs:) 1 1
evaluate(inputs:outputs:) 2 1
evaluate(inputs:outputs:) 5 1
evaluate(inputs:outputs:) 2 1
[]


vade commented Jul 20, 2020

Closer:

The Pyramid ROI layer requires a retained command buffer as opposed to an unretained one, and Metal GPU resources require manual synchronization on discrete GPUs to get results.

While I'm not matching iOS exactly, I am getting somewhat sensible results. A bit more to do.
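The two fixes described above can be sketched as follows (assumed API usage for illustration, not the repo's exact code):

```swift
import Metal

// Sketch: use a retained command buffer, and on discrete GPUs
// blit-synchronize managed resources before reading them back on the CPU.
func runAndReadBack(queue: MTLCommandQueue, output: MTLBuffer) {
    // makeCommandBuffer() retains the resources it references; the
    // unretained variant (makeCommandBufferWithUnretainedReferences) makes
    // the caller responsible for keeping every resource alive.
    guard let commandBuffer = queue.makeCommandBuffer() else { return }

    // ... encode the layer's compute work writing into `output` ...

    // On discrete GPUs, .managed resources have separate CPU and GPU
    // copies; a blit synchronize is required before the CPU copy is valid.
    if output.storageMode == .managed,
       let blit = commandBuffer.makeBlitCommandEncoder() {
        blit.synchronize(resource: output)
        blit.endEncoding()
    }

    commandBuffer.commit()
    commandBuffer.waitUntilCompleted()
    // `output.contents()` is now safe to read on the CPU.
}
```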
