[question] What kind of HTTP request method is used for file crawling? #20
Comments
I'm not sure that your problem is caused by the proxy, but could you try the following command?
$ curl -H "Authorization: token <token>" "http://localhost:8080/gitbucket/api/v3/repos/<user name>/<repository name>/contents/<file name>?ref=<commit hash>&large_file=true"
The value of <commit hash> can be obtained with:
$ curl -H "Authorization: token <token>" "http://localhost:8080/gitbucket/api/v3/repos/<user name>/<repository name>/git/refs/heads/master"
If you want to learn more about how Fess gets files, see GitBucketDataStoreImpl.java.
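For readers who want to script the same check, here is a minimal Python sketch of the request shape. It only mirrors the curl commands in this comment: a plain GET (the stack trace in this thread goes through HcHttpClient.doGet) with an "Authorization: token ..." header. The helper names contents_url, branch_ref_url and build_request are hypothetical and are not part of Fess or GitBucket.

```python
import urllib.request
from urllib.parse import quote, urlencode


def contents_url(base, owner, repo, path, commit):
    """Build the file-contents URL in the same layout as the curl command."""
    query = urlencode({"ref": commit, "large_file": "true"})
    return (f"{base}/api/v3/repos/{quote(owner)}/{quote(repo)}"
            f"/contents/{quote(path)}?{query}")


def branch_ref_url(base, owner, repo, branch="master"):
    """Build the URL of the branch ref, as in the second curl command."""
    return (f"{base}/api/v3/repos/{quote(owner)}/{quote(repo)}"
            f"/git/refs/heads/{quote(branch)}")


def build_request(url, token):
    """Create the request object; no body is attached, so urllib sends a GET."""
    return urllib.request.Request(
        url, headers={"Authorization": f"token {token}"})


if __name__ == "__main__":
    url = contents_url("http://localhost:8080/gitbucket", "root", "name",
                       "hoge", "efcd9adbec49f73f762b7b2127153593024e4bea")
    req = build_request(url, "<token>")
    # Nothing is sent here; pass req to urllib.request.urlopen to send it.
    print(req.get_method(), req.full_url)
```

Building the request without sending it makes it easy to compare the URL and header with what appears in fess-crawler.log before involving the network.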
Thanks @kw-udon.
# curl -H "Authorization: token 284530a64e55176f9ed9*********" "http://gitbucket:8080/gitbucket/api/v3/repos/root/name/contents/hoge?ref=efcd9adbec49f73f762b7b2127153593024e4bea&large_file=true"
{"type":"file","name":"hoge","path":"hoge","sha":"efcd9adbec49f73f762b7b2127153593024e4bea","content":"IyBBcHAgYXJ0aWZhY3RzCi9fYnVpbGQKLLmV4cw==","encoding":"base64","download_url":"http://gitbucket:8080/gitbucket/api/v3/repos/root/name/raw/efcd9adbec49f73f762b7b2127153593024e4bea/hoge"}
So the proxy did not discard or refuse the request.
The cause is above. It's a network problem.
@marevol thanks!
2018-02-15 14:15:37,744 [5DFNjmEBO7Desvq7XhyO-1] DEBUG Accessing http://gitbucket:8080/gitbucket/api/v3/repos/user/repo/contents/hoge?ref=37cce0819cdf0a357e0b5e9bc373030dbfa84cd6&large_file=true
2018-02-15 14:15:37,745 [5DFNjmEBO7Desvq7XhyO-1] DEBUG CookieSpec selected: default
2018-02-15 14:15:37,746 [5DFNjmEBO7Desvq7XhyO-1] DEBUG Connection request: [route: {}->http://gitbucket:8080][total kept alive: 0; route allocated: 0 of 20; total allocated: 0 of 200]
2018-02-15 14:15:37,746 [5DFNjmEBO7Desvq7XhyO-1] DEBUG Connection leased: [id: 1][route: {}->http://gitbucket:8080][total kept alive: 0; route allocated: 1 of 20; total allocated: 1 of 200]
2018-02-15 14:15:37,746 [5DFNjmEBO7Desvq7XhyO-1] DEBUG Opening connection {}->http://gitbucket:8080
2018-02-15 14:15:37,746 [5DFNjmEBO7Desvq7XhyO-1] DEBUG Connecting to gitbucket/IP:8080
2018-02-15 14:15:37,747 [5DFNjmEBO7Desvq7XhyO-1] DEBUG http-outgoing-1: Shutdown connection
2018-02-15 14:15:37,747 [5DFNjmEBO7Desvq7XhyO-1] DEBUG Connection discarded
2018-02-15 14:15:37,748 [5DFNjmEBO7Desvq7XhyO-1] DEBUG Connection released: [id: 1][route: {}->http://gitbucket:8080][total kept alive: 0; route allocated: 0 of 20; total allocated: 0 of 200]
2018-02-15 14:15:37,748 [5DFNjmEBO7Desvq7XhyO-1] DEBUG Cancelling request execution
2018-02-15 14:15:37,748 [5DFNjmEBO7Desvq7XhyO-1] DEBUG Failed to access to http://gitbucket:8080/gitbucket/api/v3/repos/user/repo/contents/hoge?ref=37cce0819cdf0a357e0b5e9bc373030dbfa84cd6&large_file=true
org.codelibs.fess.crawler.exception.CrawlingAccessException: Connection time out(Connect to gitbucket:8080 [gitbucket/IP] failed: Connection refused (Connection refused)): http://gitbucket:8080/gitbucket/api/v3/repos/user/repo/contents/hoge?ref=37cce0819cdf0a357e0b5e9bc373030dbfa84cd6&large_file=true
at org.codelibs.fess.crawler.client.http.HcHttpClient.processHttpMethod(HcHttpClient.java:820) ~[fess-crawler-2.0.1.jar:?]
at org.codelibs.fess.crawler.client.http.HcHttpClient.doHttpMethod(HcHttpClient.java:623) ~[fess-crawler-2.0.1.jar:?]
at org.codelibs.fess.crawler.client.http.HcHttpClient.doGet(HcHttpClient.java:582) ~[fess-crawler-2.0.1.jar:?]
at org.codelibs.fess.crawler.client.AbstractCrawlerClient.execute(AbstractCrawlerClient.java:142) ~[fess-crawler-2.0.1.jar:?]
at org.codelibs.fess.crawler.client.FaultTolerantClient.execute(FaultTolerantClient.java:67) ~[fess-crawler-2.0.1.jar:?]
at org.codelibs.fess.helper.DocumentHelper.processRequest(DocumentHelper.java:148) ~[classes/:?]
at org.codelibs.fess.ds.impl.GitBucketDataStoreImpl.storeFileContent(GitBucketDataStoreImpl.java:291) ~[classes/:?]
at org.codelibs.fess.ds.impl.GitBucketDataStoreImpl.lambda$storeData$4713(GitBucketDataStoreImpl.java:134) ~[classes/:?]
at org.codelibs.fess.ds.impl.GitBucketDataStoreImpl.crawlFileContents(GitBucketDataStoreImpl.java:441) [classes/:?]
at org.codelibs.fess.ds.impl.GitBucketDataStoreImpl.crawlFileContents(GitBucketDataStoreImpl.java:447) [classes/:?]
at org.codelibs.fess.ds.impl.GitBucketDataStoreImpl.storeData(GitBucketDataStoreImpl.java:124) [classes/:?]
at org.codelibs.fess.ds.impl.AbstractDataStoreImpl.store(AbstractDataStoreImpl.java:106) [classes/:?]
at org.codelibs.fess.helper.DataIndexHelper$DataCrawlingThread.process(DataIndexHelper.java:236) [classes/:?]
at org.codelibs.fess.helper.DataIndexHelper$DataCrawlingThread.run(DataIndexHelper.java:222) [classes/:?]
Caused by: org.apache.http.conn.HttpHostConnectException: Connect to gitbucket:8080 [gitbucket/IP] failed: Connection refused (Connection refused)
at org.apache.http.impl.conn.DefaultHttpClientConnectionOperator.connect(DefaultHttpClientConnectionOperator.java:159) ~[httpclient-4.5.4.jar:4.5.4]
at org.apache.http.impl.conn.PoolingHttpClientConnectionManager.connect(PoolingHttpClientConnectionManager.java:373) ~[httpclient-4.5.4.jar:4.5.4]
at org.apache.http.impl.execchain.MainClientExec.establishRoute(MainClientExec.java:381) ~[httpclient-4.5.4.jar:4.5.4]
at org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:237) ~[httpclient-4.5.4.jar:4.5.4]
at org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:185) ~[httpclient-4.5.4.jar:4.5.4]
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392) ~[?:1.8.0_161]
at java.net.Socket.connect(Socket.java:589) ~[?:1.8.0_161]
at org.apache.http.conn.socket.PlainConnectionSocketFactory.connectSocket(PlainConnectionSocketFactory.java:75) ~[httpclient-4.5.4.jar:4.5.4]
at org.apache.http.impl.conn.DefaultHttpClientConnectionOperator.connect(DefaultHttpClientConnectionOperator.java:142) ~[httpclient-4.5.4.jar:4.5.4]
at org.apache.http.impl.conn.PoolingHttpClientConnectionManager.connect(PoolingHttpClientConnectionManager.java:373) ~[httpclient-4.5.4.jar:4.5.4]
at org.apache.http.impl.execchain.MainClientExec.establishRoute(MainClientExec.java:381) ~[httpclient-4.5.4.jar:4.5.4]
at org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:237) ~[httpclient-4.5.4.jar:4.5.4]
at org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:185) ~[httpclient-4.5.4.jar:4.5.4]
at org.apache.http.impl.execchain.RetryExec.execute(RetryExec.java:89) ~[httpclient-4.5.4.jar:4.5.4]
at org.apache.http.impl.execchain.RedirectExec.execute(RedirectExec.java:111) ~[httpclient-4.5.4.jar:4.5.4]
at org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:185) ~[httpclient-4.5.4.jar:4.5.4]
at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:83) ~[httpclient-4.5.4.jar:4.5.4]
at org.codelibs.fess.crawler.client.http.HcHttpClient.executeHttpClient(HcHttpClient.java:852) ~[fess-crawler-2.0.1.jar:?]
at org.codelibs.fess.crawler.client.http.HcHttpClient.processHttpMethod(HcHttpClient.java:660) ~[fess-crawler-2.0.1.jar:?]
... 13 more
...
2018-02-15 14:15:42,103 [CoreLib-TimeoutManager] DEBUG Closing expired connections
2018-02-15 14:15:42,105 [CoreLib-TimeoutManager] DEBUG Closing connections idle longer than 60000 MILLISECONDS
From this log, the connection appears to have been dropped because of a connection timeout or a refused connection.
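To tell a timeout from a refusal, one option is a bare TCP connect from the machine where Fess runs. In the log, "Connecting to gitbucket/IP:8080" and "Shutdown connection" are about one millisecond apart, which looks more like an immediate refusal than a timeout, but that is only a reading of the timestamps. The sketch below is an assumption-laden diagnostic, not something Fess does itself; probe is a hypothetical helper name, and the host and port are taken from the log.

```python
import socket


def probe(host, port, timeout=3.0):
    """Attempt a plain TCP connect and classify the outcome."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # The peer answered with a reset: usually nothing is listening
        # on that port, or something in between rejects the connection.
        return "refused"
    except socket.timeout:
        # No answer within the timeout: often filtered or unreachable.
        return "timeout"
    except OSError as exc:
        # Anything else, for example a name resolution failure.
        return f"error: {exc}"


if __name__ == "__main__":
    print(probe("gitbucket", 8080))
```

Running it both on the Fess host and on the host where curl succeeded shows whether the two machines actually reach the same endpoint.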
<configuration debug="true" scan="true" scanPeriod="60 seconds">
<appender name="STDOUT" class="ch.qos.logback.core.ConsoleAppender">
<!-- encoders are by default assigned the type
ch.qos.logback.classic.encoder.PatternLayoutEncoder -->
<filter class="ch.qos.logback.classic.filter.ThresholdFilter">
<level>INFO</level>
</filter>
<encoder>
<pattern> %date %-4relative [%thread] %-5level %logger{36} - %msg%n</pattern>
</encoder>
</appender>
<appender name="ROLLING" class="ch.qos.logback.core.rolling.RollingFileAppender">
<!-- encoders are by default assigned the type
ch.qos.logback.classic.encoder.PatternLayoutEncoder -->
<rollingPolicy class="ch.qos.logback.core.rolling.TimeBasedRollingPolicy">
<!-- rollover daily and compress-->
<fileNamePattern>/gitbucket/log/gitbucket-%d{yyyy-MM-dd}.%i.log.gz</fileNamePattern>
<!-- compressed logs are kept for 30 days and then deleted -->
<maxHistory>30</maxHistory>
<timeBasedFileNamingAndTriggeringPolicy class="ch.qos.logback.core.rolling.SizeAndTimeBasedFNATP">
<maxFileSize>25MB</maxFileSize>
</timeBasedFileNamingAndTriggeringPolicy>
</rollingPolicy>
<filter class="ch.qos.logback.classic.filter.ThresholdFilter">
<level>INFO</level>
</filter>
<encoder>
<pattern>%d{HH:mm:ss.SSS} %-4relative [%thread] %-5level %logger{36} - %msg%n</pattern>
</encoder>
</appender>
<root level="DEBUG">
<appender-ref ref="STDOUT"/>
<appender-ref ref="ROLLING"/>
</root>
</configuration>
Any ideas?
Did you configure proxy settings?
@marevol Yes, I configured the proxy settings in fess_config.properties:
http.proxy.host=proxy_IP
http.proxy.port=proxy_port
http.proxy.username=
http.proxy.password=
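A curl run from a shell does not go through a proxy unless one is passed explicitly or set in the environment, so a successful curl may not exercise the same path as the crawler. A rough way to reproduce the proxied path is to send the same GET through the proxy host and port from the properties above. This is a sketch under that assumption: proxy_url and fetch_via_proxy are hypothetical helpers, and how Fess itself maps these properties onto its HTTP client is not shown in this thread.

```python
import urllib.request
from urllib.parse import quote


def proxy_url(host, port, username="", password=""):
    """Turn host/port (and optional credentials) into a proxy URL."""
    cred = ""
    if username:
        cred = f"{quote(username, safe='')}:{quote(password, safe='')}@"
    return f"http://{cred}{host}:{port}"


def fetch_via_proxy(url, token, proxy):
    """Send a GET with the token header through the given HTTP proxy."""
    handler = urllib.request.ProxyHandler({"http": proxy})
    opener = urllib.request.build_opener(handler)
    req = urllib.request.Request(
        url, headers={"Authorization": f"token {token}"})
    with opener.open(req, timeout=10) as resp:
        return resp.status, resp.read()


if __name__ == "__main__":
    # Placeholder values; substitute the real proxy host and port.
    print(proxy_url("proxy_IP", 8080))
```

If the call fails through the proxy but succeeds directly, the proxy cannot reach gitbucket:8080 and the GitBucket host may need to be excluded from proxying or made reachable from the proxy.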
plugin version
1.3.1
gitbucket version
4.20
What is the matter
In the proxy environment, I can't get the content of files, but I can get issues and wikis.
fess-crawler.log is as follows,
On Linux, both requests seem to return the same result.
I think it may be a problem with the proxy settings (the proxy discards the file request).
I would like to know about the HTTP request used by the file crawl API.
Thanks.